PREFACE


This book has been written for the use of Indian students; it covers
a portion of the Post-Graduate Pure Mathematics course of the University
of Calcutta. The interest in modern algebra has much increased all over
India in recent years, but the lack of a textbook in the English language made
it difficult to introduce the subject in the regular courses. The author
expects that this book will help to enhance the popularity of algebra at
Indian Universities, and the encouragement which he received from friends
in the U.S.A. makes him hope that the new textbook will be useful even
outside India.

The urgent need for a book of this kind was first felt when a new course
in algebra was introduced in Calcutta in 1936. The students were not able
to follow the lectures without some kind of textbook in their hands. Thus
it was necessary to issue lecture-notes* in small instalments which were
printed by the University Press as quickly as possible. In spite of many
small deficiencies due to the circumstances under which those lecture-notes
were published, they could be used as a basis for the instruction in algebra
in the Post-Graduate Department of Pure Mathematics of our University
during six academic years. The present book can therefore be considered
as a second edition, completely revised in accordance with the experience
obtained by the teaching.

I am much indebted to two lecturers who did most of the teaching in
algebra during the last few years, Dr. Rabindranath Sen and Mr. Rajchandra
Bose, M.A., for letting me participate in the experience which they obtained
by lecturing to general and tutorial classes according to those lecture-notes.
During the last session, typed copies of the present book were already in
the hands of those scholars, and many small alterations were made on
their advice. I also had much benefit from conversations with students,
especially those who after having completed the course continued to study
mathematics as research scholars under my guidance; they remembered
very well the passages of the lecture-notes which used to be most difficult
to them. Although modern algebra is not a difficult subject, it requires
some change of mind from students whose previous training was in classical
mathematics only. They are familiar with investigations on interesting
mathematical entities, but they are not used to considerations about relations
between objects which are indeterminate. No wonder that the students
felt uneasy when the course was introduced; they did not realise the “raison
d’être” of the subject. Gradually, bewilderment gave way to enthusiasm,
especially among the better students; when the first two-years course was
completed, some of them suggested to me to omit, or at least restrict, such
subjects as continued fractions and approximation of roots, and to give
more of modern algebra, but I was unable to follow this advice.

* Algebra, lectures delivered to post-graduate students, Part I-V, by F. W. Levi,
1936-37.

It was the author’s intention to keep an equal balance between modern
and classical portions of algebra; he imposed on himself the greatest
restriction in the use of notions and notations of more recent origin, and he
limited his programme on general algebra by keeping the notions of ideal and
of non-commutative group outside. These subjects will be treated in the second
volume. Actual teaching-experience made him modify this plan; one will
find in this book some notations like integral domain, Euclidean domain
etc., which have not been used in the lecture-notes, but which proved very
useful in the class. The self-imposed limitations concerning theory of
groups have been shown to be too rigid. The general notion of group is
an essential part of every reasonable teaching of geometry; by this argument,
the author’s junior friends and disciples convinced him that no additional
burden would be imposed on the students by including this notion in the
compulsory algebra-course.

In a systematical representation of the subject (e.g. van der Waerden’s
“Moderne Algebra”) one starts from those notions which have been proved
to be fundamental; when the whole system has been built up, one sees
the reasons for every step, and one feels much admiration. A methodical
discussion starts with examples, the notions being introduced successively
at those points where their usefulness becomes obvious. In this respect
too, the author follows a middle way. Chapter I deals with the solutions
of systems of linear equations. The importance of this problem is obvious;
the results are applied in the course on geometry which the students as a
rule follow during the same session. Since the introduction of new notions
like vector, vectorspace etc. is shown to be very helpful, the reader may get
the necessary confidence to dive into the very abstract investigations of
Chapter II which find their applications in Chapter III. These
considerations on general algebra are continued in Chapter VI where matrices
are studied. Chapters IV and V deal with continued fractions and with
approximative calculations. It is possible to treat all classical problems
from the point of view of general algebra. Of course one can consider the
field of the real numbers as a special case of the formally real fields; a
substantial portion of the continuum-algebra can be shown to hold in these
fields. The notion of formally real fields has not been used in this book;
the author has preferred to respect the autonomy of the theory of numerical
calculation. This theory originates from the needs of the computor, and
it has been treated here accordingly, starting from a very simple principle
of calculation, Horner’s scheme. Whereas in the fifth chapter very little
reference is made to the principles of general algebra, the continued fractions
(Chapter IV) are treated, so to say, in a half-classical manner. It appears
that the suppositions which are usually made in this theory are not all
necessary, and that they afford interesting generalisations. This kind of
treatment has been suggested to the author, not by papers on general algebra,
but by a book on numerical calculation (Runge-Koenig).

Some readers may be astonished to find two dialogues in this book.
Needless to say that the two characters, tutor and student, do not
represent any individuals; nor is the “student” a true picture of the average
Indian mathematics student of the present time. He is an international
creature, and some of the remarks originate from the difficulties which the
author himself had to face when a student more than 30 years ago. There
are certain items in mathematics which can best be made clear by a frank
discussion; the author wants to encourage this form of teaching by giving
these two specimens of discussions between a teacher and his pupil. The
“introductory remarks” (Chapter zero) are meant to refresh the memory
of a few subjects which are supposed to be known by the students joining
the post-graduate classes.

Western mathematicians may wonder why the “method of identification”
is discussed here in such an explicit manner. This item, which was not
treated in the lecture-notes, has been introduced into the book because the
experience of teaching showed its necessity. This is the only occasion where
I came across an essential difference between the Indian way of thinking
in mathematics and the western one. It seems that the western mind
performs, so to say, automatically the operation of identification; even
Edmund Landau, whose rigour and explicitness have become proverbial,
used to pass it over without explanation. I remember only a single case
where I had to discuss this item with a student of Leipzig University, and
at that occasion it was not my task to clear up the difficulty, but to show
that there is one. When introducing the new course on algebra in Calcutta
I did not like to burden it with considerations which in Europe were thought
to be unnecessary sophistries, and I was very astonished that every year
the students felt difficulties and asked for explanations at that particular
point. I am stating this experience here without feeling competent to
explain it. Scholars of Indology may find some clue in ancient Indian logic,
though very few of our mathematics students have an explicit knowledge
of it. Similarly, I must leave it to Indian scholars to explore why western
people fail to recognise a difficulty which is so obvious even to an average
mathematics student in this country.

In offering this book to the public, I have much pleasure to thank for
their generous help the authorities of the University of Calcutta: the
Senate, the Syndicate and the three Vice-Chancellors who were in charge of
the University since the publication of the lecture-notes was undertaken in
1936. In particular I am obliged to the President of the Council of
Post-Graduate Teaching in Arts, the Hon’ble Dr. Shyama Prasad Mookerjee,
for his kind understanding and his energetic support without which it would
not have been possible to bring out this book in an adequate form during
the present emergency.

The assistance which I received from lecturers and research-scholars
of the Department of Pure Mathematics has already been gratefully
acknowledged. One of them, Bankim Chandra Chatterjee, M.Sc., must be
mentioned especially for having assisted the author in the correction of the
proof-sheets from beginning to end. The Eka Press has printed the book
with great care and ability.


Binsar (Himalaya),
June 1st, 1942


F. W. LEVI



INTRODUCTORY REMARKS 


0-1. Odd and even numbers.

The notions “odd” and “even” apply to positive as well as to non-positive
integral numbers, though they were used originally for positive ones
only. A number which is the double of an integral number is said to be
even. A non-even integral number is odd. Hence zero is even; if a is even
(odd), -a is even (odd) and a ± 1 is odd (even). The integral numbers form
a double sequence where the elements are alternately even and odd. The
distinction between even and odd, though it involves a very simple
principle, plays an important role in Algebra and in other branches of
Mathematics.

The sum as well as the difference of two odd (even) numbers are even;
the sum and the difference of one odd and one even number are odd. To
generalise these propositions, consider

    M = \sum_{j=1}^{n} m_j.    (1)

Let q terms of the sum be odd, and the remaining n - q terms be even numbers.
Put m_j = 2r_j + 1 if m_j is odd,
    m_j = 2r_j     if m_j is even; then

    M = \sum_{j=1}^{n} 2r_j + q    (2)

is odd or even according as q is odd or even, i.e.

If in a sum of integral numbers exactly q terms are odd, the sum is odd or even,
according as q is odd or even.
In particular, put q = n:

A sum of n odd numbers is odd or even, according as n is odd or even.
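
The proposition lends itself to a quick numerical check. The following is a minimal sketch in Python; the function name `odd_count_parity_agrees` is chosen here for illustration only and does not occur in the text.

```python
def odd_count_parity_agrees(terms):
    """Check that a sum of integers is odd exactly when the number q
    of its odd terms is odd."""
    q = sum(1 for m in terms if m % 2 != 0)   # number of odd terms
    return sum(terms) % 2 == q % 2


# Non-positive integers are admitted as well, as in 0-1.
print(odd_count_parity_agrees([3, -5, 4, 0, 7]))   # three odd terms, sum 9
print(odd_count_parity_agrees([1, 1, 2, -8]))      # two odd terms, sum -4
```

Such a check on particular cases is of course only ordinary induction in the sense of 0-2; the proof is the argument given above.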

Exercise. Let m objects be arranged in a linear manner, and let them
be rearranged to a second position. Hereby a objects move forward to the
right whereas b objects recede to the left. Out of the a elements moving
forward, a_1 are moved by an even number of places, a_2 by an odd number.
Similarly b_1 objects recede by an even and b_2 by an odd number of places.
Prove that a_2 and b_2 are either both odd, or both even.





ALGEBRA I 


0-2. Mathematical induction

In life as well as in experimental Science one is used to drawing conclusions
in the following manner. A certain observation is made in a large number
of particular cases, and from these statements one concludes that there
exists a general rule. This form of conclusion is called “conclusion by
induction”. A statement such as “Palm trees grow higher than bananas”
is based on experience got from a restricted number of plants of
both the species. We shall not investigate here why conclusions of this
type are justified in many cases. In mathematics, ordinary induction
is not admissible.* To attain general rules, a form of conclusion is often
applied which for its apparent similarity with ordinary induction is called
“mathematical induction”.

Principle of mathematical induction. Let S(n) be a statement concerning
the positive integral numbers n = 1, 2, ..., and (1) let S(1) be
a true statement; (2) let it be possible to demonstrate that if S(m) is a
true statement, then S(m + 1) is also a true statement; then S(n) holds
for every positive integral value of n.

Proof. Let S(n) be not true for every positive integral value; then
there exist positive integral values for which the statement does not hold.
Among these values there is one smallest integral number, say a + 1. As
S(1) is supposed to hold, a is a positive number. Thus S(a) holds, but
S(a + 1) does not hold, contrary to the supposition (2); hence S(n) holds
for every positive integral value of n.

Corollary. If S(n) is true for n = n_0, and if supposition (2) of the above
proposition holds, then S(n) holds for n ≥ n_0.

*As a matter of fact the task of mathematics does not consist mainly in the
enunciation of statements, but in showing the logical necessity of these
statements. A mathematical statement based on ordinary induction may be true,
but it is of little mathematical value, unless it gets a logical foundation.
An example of an obviously false statement attained by induction is this: “The
number 2520 is divisible by every integral number”. Of course it is divisible
by 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, and by some more numbers which may have been
taken at random, say 20, 56, 126, 420, 630; thus one may falsely conclude the
truth of the statement by induction.

0-3. Permutations

0-31. Representation of permutations. Given n distinct objects, say

    1, 2, ..., n.    (1)

Consider the transformations by which the objects (1) are interchanged.
These transformations are called permutations. Any permutation is
uniquely determined when it is known into which element each individual
element of (1) is transformed. Hence a permutation can be represented by
the help of two lines, each of them containing the n elements (1), ordered
in such a way that below every element, say k, in the upper line is written
the element a_k into which k is transformed:

    A = \begin{pmatrix} 1 & 2 & \cdots & n \\ a_1 & a_2 & \cdots & a_n \end{pmatrix}.    (2)



It is not necessary that the digits (1) are given in their natural order in the
first line of the scheme (2). They may be interchanged in any manner
if the same change is made in the second line; the only essential thing is that
a particular element a_k is put below a particular k to signify that k has to
be replaced by a_k. To check whether two permutations given in
any manner are identical or not, one may interchange the vertical columns
in the corresponding schemes (2) in such a way that the digits in the first
lines have the natural order in both the cases. If after this operation the
second lines are the same for the two permutations, these are identical,
otherwise different. It is however not always convenient to arrange the digits
of the first line in their natural order. E.g. there exists to every permutation
A an inverse permutation

    A^{-1} = \begin{pmatrix} a_1 & a_2 & \cdots & a_n \\ 1 & 2 & \cdots & n \end{pmatrix}.    (2')



The connection between A and A^{-1} is such that if the permutation A replaces
any object c by an object d, then A^{-1} replaces d by c. Hence if one
interchanges the objects (1) at first according to A and then according to
A^{-1}, as a result every object remains unaltered. The permutation not altering
any object must indeed also be considered as a permutation; it is called
the identical permutation, or the identity

    J = \begin{pmatrix} 1 & 2 & \cdots & n \\ 1 & 2 & \cdots & n \end{pmatrix}.    (3)



Furthermore, the following permutations are of special interest.

Cyclic permutations

    \begin{pmatrix} a_1 & a_2 & \cdots & a_m & c_1 & \cdots & c_{n-m} \\ a_2 & a_3 & \cdots & a_1 & c_1 & \cdots & c_{n-m} \end{pmatrix} = (a_1, a_2, ..., a_m);    (4)

in particular cyclic permutations with m = 2 are called transpositions

    \begin{pmatrix} a & b & c_1 & \cdots & c_{n-2} \\ b & a & c_1 & \cdots & c_{n-2} \end{pmatrix} = (a, b).    (5)

The transposition (5) is therefore only an interchange of the two objects
a and b; the notations on the right hand side of (4) and (5) will be generalised
in 0-33.
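
The two-line scheme can be modelled directly on a computer. The following is a minimal sketch in Python, under the assumption (made here, not in the text) that a permutation is stored as a dictionary sending each k to a_k; the names `identity`, `inverse` and `cyclic` are illustrative only.

```python
def identity(objects):
    """The identical permutation J: every object remains unaltered."""
    return {k: k for k in objects}


def inverse(A):
    """Interchange the two lines of the scheme: if A replaces c by d,
    the inverse replaces d by c."""
    return {a_k: k for k, a_k in A.items()}


def cyclic(objects, *cycle):
    """The cyclic permutation (cycle[0], cycle[1], ...) of the given objects;
    with two entries it is a transposition."""
    perm = identity(objects)
    for i, a in enumerate(cycle):
        perm[a] = cycle[(i + 1) % len(cycle)]
    return perm


objects = range(1, 5)
A = {1: 3, 2: 1, 3: 4, 4: 2}
A_other_order = {3: 4, 1: 3, 4: 2, 2: 1}   # same columns, interchanged

print(A == A_other_order)       # the order of the columns is immaterial
print(inverse(A))
print(cyclic(objects, 1, 2))    # the transposition (1, 2)
```

A dictionary is convenient here because two dictionaries are equal when they contain the same pairs, whatever the order of the columns; this mirrors the check for identical permutations described above.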

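0-32. Composition of permutations. Consider 3 permutations of the
objects 0-31, (1), say

    A = \begin{pmatrix} k \\ a_k \end{pmatrix},  B = \begin{pmatrix} k \\ b_k \end{pmatrix},  C = \begin{pmatrix} k \\ c_k \end{pmatrix}.

As a_k takes all the values 1, ..., n, one can also denote

    B = \begin{pmatrix} a_k \\ b_{a_k} \end{pmatrix}.

If one performs therefore at first the permutation A, and then the
permutation B, any object k is replaced by b_{a_k}.

The permutation attained in this manner is said to be composed of A and
B and will be denoted here as a product

    BA = \begin{pmatrix} k \\ b_{a_k} \end{pmatrix}.    (1)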



The products BA and AB are in general different. Readers may wonder
why the order of the transformations appears in the notation of the product
written from the right to the left. This manner of notation has its analogues
in other branches of mathematics, e.g. φ f(x) means that x should be
replaced by y = f(x), and then y by φ(y). For this similarity, the
notation used here is sometimes called a “functional” manner of notation.
Many authors use an inverse method of notation proceeding from the left
to the right. It is a mere convention which notation to follow, as both
ways of notation are equivalent and have their (purely formal) advantages
and disadvantages.
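
The right-to-left convention can be made concrete by a small sketch in Python. It assumes, as an illustration only, that a permutation is stored as a dictionary sending k to its image; the name `compose` is not taken from the text.

```python
def compose(B, A):
    """The product BA: perform A at first, then B, so that k goes to b_{a_k}."""
    return {k: B[A[k]] for k in A}


A = {1: 2, 2: 3, 3: 1}   # the cyclic permutation (1, 2, 3)
B = {1: 2, 2: 1, 3: 3}   # the transposition (1, 2)

BA = compose(B, A)       # A first, then B
AB = compose(A, B)       # B first, then A

print(BA)
print(AB)
print(BA == AB)          # the two products differ for this choice
```

The argument order of `compose(B, A)` is chosen to match the written product BA, just as φ f(x) applies f before φ.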

Applying formula (1) to the products CB, (CB)A, C(BA), one gets

    \begin{pmatrix} k \\ c_{b_{a_k}} \end{pmatrix} = (CB)A = C(BA).    (2)

Hence one can omit the brackets on the right hand side, and denote (2)
by CBA. This permutation is attained by performing at first A, then
B, and finally C. From (2) follows:

For the composition of permutations the associative law holds.

Applying (2) to (2') and (3) of 0-31, one gets for every permutation A

    AJ = JA = A,    (3)

    AA^{-1} = A^{-1}A = J.    (4)

Let furthermore A and B be any two permutations, and

    AX = B,    YA = B;    (5)

by multiplying A^{-1} from the left (right) hand side and applying (2) and (4)
one gets

    X = A^{-1}B,    Y = BA^{-1}.    (6)

On the other hand (6) is a solution of (5), thus the equations (5) possess one
and only one solution.
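
The solutions (6) of the equations (5) may be verified on an example. The sketch below is in Python and again assumes, for illustration only, that a permutation is stored as a dictionary sending k to its image; `compose` and `inverse` are hypothetical helper names.

```python
def compose(B, A):
    """The product BA: perform A at first, then B."""
    return {k: B[A[k]] for k in A}


def inverse(A):
    """The inverse permutation: if A replaces c by d, it replaces d by c."""
    return {image: k for k, image in A.items()}


A = {1: 2, 2: 3, 3: 1}
B = {1: 3, 2: 2, 3: 1}

X = compose(inverse(A), B)   # X = A^{-1} B solves AX = B
Y = compose(B, inverse(A))   # Y = B A^{-1} solves YA = B

print(compose(A, X) == B)
print(compose(Y, A) == B)
print(X == Y)                # X and Y need not coincide
```

Since the products BA and AB are in general different, the two equations in (5) have to be solved separately; in this example X and Y are indeed different permutations.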

0-33. Decomposition of permutations. A permutation can be
represented as a product of permutations of a special type. Two important
ways of representation will be discussed here.

Theorem 1. Every permutation of n > 1 objects can be represented
as a product of transpositions.

Proof. (By mathematical induction.) For n = 2 the theorem is obvious.
Suppose the theorem to be true for n = m - 1; then it holds also for those
permutations of m objects where at least one object remains unaltered.
Let A be an arbitrary permutation of m objects, and let by A the object
a be changed into b. Then a is unaltered by A' = (a, b) A, and therefore
A' is a product of transpositions. Hence

    A = JA = (a, b)(a, b)A = (a, b)A'

is a product of transpositions.
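
The proof is constructive: it splits off one transposition (a, b) at a time. The following Python sketch follows that argument step by step; it assumes that a permutation is stored as a dictionary sending k to its image, and the names `compose`, `transposition` and `as_transpositions` are introduced here for illustration only.

```python
def compose(B, A):
    """The product BA: perform A at first, then B."""
    return {k: B[A[k]] for k in A}


def transposition(objects, a, b):
    """The transposition (a, b) of the given objects."""
    perm = {k: k for k in objects}
    perm[a], perm[b] = b, a
    return perm


def as_transpositions(A):
    """Return pairs t_1, ..., t_s with A = t_1 t_2 ... t_s
    (the rightmost factor is performed first)."""
    factors = []
    rest = dict(A)
    for a in sorted(rest):
        b = rest[a]
        if b != a:
            factors.append((a, b))
            # A' = (a, b) A leaves a unaltered, and A = (a, b) A'
            rest = compose(transposition(rest, a, b), rest)
    return factors


A = {1: 3, 2: 1, 3: 4, 4: 2}
factors = as_transpositions(A)
print(factors)

# Rebuild A from its factors, performing the rightmost factor first.
product = {k: k for k in A}
for a, b in reversed(factors):
    product = compose(transposition(A, a, b), product)
print(product == A)
```

Every step leaves one more object unaltered, so this procedure never needs more than n - 1 transpositions.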

Exercises. (1) Prove that the number of the different permutations
of n objects is equal to n!.

(2) Represent J as a product of two transpositions.

(3) Show that A ≠ J can be represented as a product of transpositions
which are less than n in number.

The representation of A as a product of transpositions is not unique.
One can e.g. perform any number of transpositions and then arrange
systematically (see ex. 3). We shall now consider a different
representation which is unique.

Let A be any particular permutation, and let a_1 be transformed by
A into a_2, again a_2 into a_3, etc. The sequence

    a_1, a_2, a_3, ...

is infinite; every element in it determines uniquely the following as well
as the preceding one, hence, as it contains only a finite number of
different elements, it is a periodic sequence. The elements, say

    a_1, a_2, ..., a_{m_1},    (1)

are interchanged among themselves in a cyclic order. Let b_1 be any object
subject to the permutation and not belonging to the cycle (1); then b_1 is
transformed into an element b_2, which does not belong to (1), and so b_1
generates a cycle b_1, ..., b_{m_2}, which has no element in common with (1). This
procedure can be repeated till it stops after r ≤ n steps. The permutation
therefore generates a partition of the given objects into r cycles

    (a_1, ..., a_{m_1}) (b_1, ..., b_{m_2}) ... (d_1, ..., d_{m_r}),    (2)

where m_1 + m_2 + ... + m_r = n.

In every cycle the order of the elements is determined up to a cyclic
permutation which remains arbitrary, and the cycles can be interchanged among
themselves; but for a given A, every object determines its cycle uniquely,
hence A determines (2) uniquely. The objects which are not displaced by
A form cycles by themselves each. For abbreviation, those cycles with one
element only are often omitted. The notation introduced in 0-31, (4)
and (5) appears now to be a special case of the notation introduced here.
On the other hand, (2) can be considered as a product of the cyclic
permutations corresponding to its cycles. Hence

Theorem 2. Any permutation A can be represented as a product
(2) of cyclic permutations in such a way that different factors displace
different objects. This representation is unique except for the order of the
factors which remains arbitrary.
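
The partition into cycles can be computed exactly as described: start from an object, follow its images until the sequence returns to the start, then continue with an object not yet met. The Python sketch below assumes a permutation stored as a dictionary sending k to its image; the function name `cycles` and the convention of beginning each cycle with its smallest element are choices made here, not by the text.

```python
def cycles(A):
    """Partition the objects of the permutation A into its cycles.
    Cycles with one element only are included."""
    seen = set()
    result = []
    for start in sorted(A):
        if start in seen:
            continue
        cycle = [start]
        seen.add(start)
        k = A[start]
        while k != start:        # follow a_1 -> a_2 -> ... until the period closes
            cycle.append(k)
            seen.add(k)
            k = A[k]
        result.append(tuple(cycle))
    return result


print(cycles({1: 3, 2: 1, 3: 4, 4: 2, 5: 5}))
print(cycles({1: 2, 2: 1, 3: 4, 4: 5, 5: 3}))
```

Fixing the starting element of each cycle removes the arbitrariness mentioned above, so that two equal permutations always yield the same list.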

0-34. Even and odd permutations

Definition. The permutation A is said to be even or odd according
as the number of even cycles in 0-33, (2) is even or odd.

Theorem. Every product of an even (odd) number of transpositions
is even (odd).

Proof. The theorem is obvious if the number of the transpositions
is zero or one. By the principle of mathematical induction one has
therefore only to prove that by composition of a permutation A from the left
with a transposition, an even A becomes odd and conversely. Let A be
represented by cycles as in (2), the cycles with one element only (if any)
also being considered, and let (a_1, b_1) be the transposition. Then either
a_1 and b_1 occur in the same cycle of A, or in two different cycles. Now

    (a_1, b_1) (a_1, ..., a_r, b_1, ..., b_s) = (a_1, ..., a_r) (b_1, ..., b_s).

Multiplying from the left with (a_1, b_1) and interchanging the two sides one
gets

    (a_1, b_1) (a_1, ..., a_r) (b_1, ..., b_s) = (a_1, ..., a_r, b_1, ..., b_s).

As the other cycles are unaltered, the left hand factor (a_1, b_1) effects either
a partition of a cycle into two cycles, or conversely an amalgamation of
two cycles into one. An odd cycle is partitioned into one even and one odd,
an even cycle into either two odd or two even ones. Hence the number
of even cycles changes by ±1. Similarly for the amalgamation, as it is the
converse operation.

Corollary (1). A permutation which can be represented as a product
of an even (odd) number of transpositions cannot be represented as a
product of an odd (even) number of transpositions.

Corollary (2). By composing two even (odd) permutations one gets
an even permutation; by composing one even and one odd permutation
one gets an odd permutation.
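
For a small number of objects the definition and Corollary (2) can be tested on all permutations at once. The following Python sketch does so for 4 objects; it assumes a permutation stored as a dictionary sending k to its image, and the names `cycle_lengths`, `is_even` and `compose` are illustrative only.

```python
from itertools import permutations


def cycle_lengths(perm):
    """Lengths of the cycles of a permutation; cycles with one
    element only are counted too."""
    seen, lengths = set(), []
    for start in perm:
        if start in seen:
            continue
        length, k = 0, start
        while k not in seen:
            seen.add(k)
            k = perm[k]
            length += 1
        lengths.append(length)
    return lengths


def is_even(perm):
    """Definition of 0-34: even if the number of even cycles is even."""
    even_cycles = sum(1 for length in cycle_lengths(perm) if length % 2 == 0)
    return even_cycles % 2 == 0


def compose(B, A):
    """The product BA: perform A at first, then B."""
    return {k: B[A[k]] for k in A}


objects = (1, 2, 3, 4)
all_perms = [dict(zip(objects, images)) for images in permutations(objects)]

# Corollary (2): BA is even exactly when A and B are both even or both odd.
corollary_holds = all(
    is_even(compose(B, A)) == (is_even(A) == is_even(B))
    for A in all_perms for B in all_perms
)
print(corollary_holds)
```

The check runs through all 24 x 24 pairs; it confirms the corollary for n = 4 only and is no substitute for the proof.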

Exercises. (1) Write down some permutations, and investigate whether
they are even or odd.

(2) Show that every even permutation of n > 2 objects can be
represented by composing suitable cyclic permutations of 3 objects each.

(3) An even cycle is an odd permutation, an odd cycle is an even
permutation.



CHAPTER I 


In this chapter systems of linear equations

    a_1 x_1 + a_2 x_2 + ... + a_n x_n = a
    b_1 x_1 + b_2 x_2 + ... + b_n x_n = b
    . . . . . . . . . . . . . . . . . . .
    k_1 x_1 + k_2 x_2 + ... + k_n x_n = k

will be considered.

1-1. Introduction. A dialogue.

Student. The problem enunciated at the beginning of this chapter
seems to be a very easy one, but I have seen such words as “vectorspace”,
“rank”, “matrix” later in the book; I also noticed determinants and
formulas with upper and lower indices. I cannot understand why the
author is trying to make a very simple thing so complicated. The problem
can be solved with the help of methods which I learned when I read for
the matriculation examination.

Tutor. Of course, it is my duty to help you to understand this theory
clearly, but Mathematics is not a matter of seniority. History shows
several examples where mathematicians were superior to their masters at a
very early age of life. I should not miss the opportunity to learn
something from you; please, explain your solution of the problem.

Student. It is simply the method of substitution. From the last
equation it follows that x_n = (k - k_1 x_1 - ... - k_{n-1} x_{n-1})/k_n. Putting
this value into the remaining linear equations, I get linear equations with
n - 1 unknowns only. After having solved this system, we calculate the
value of x_n by putting the values of x_1, ..., x_{n-1} in that equation. Is it so?

Tutor. Yes, provided k_n ≠ 0.

St. In the case k_n = 0, x_n is infinite!


T. I do not think so! E.g. consider two equations and n = 2,
k_2 = 0, say

    x_1 + 2x_2 = 5,    x_1 + 0x_2 = 1.

The only solution of this system is obviously x_1 = 1, x_2 = 2.

St. Yes, that is true. If k_n = 0, then I take another equation of
the system in place of the last one. Thus without loss of generality, I
suppose k_n ≠ 0. I think you will be satisfied.

T. Unless the coefficient of x_n is zero in each equation of the system.

St. In this case the system has to be considered as a system with
(n - 1) variables only; it would be absurd to consider it as a system of n
variables as the equations are actually independent of x_n.

T. Perhaps less absurd than you may believe, but I accept your
definition that a system of linear equations should be considered to depend
on such variables only, as have at least one coefficient different from zero.

St. Certainly.

T. After x_n has been eliminated, how do you continue?

St. I shall repeat the process again and again until I get one equation
with one variable x_1, and then there is no problem left.

T. You suppose that the number of equations is equal to the number
of the variables, and you believe that at every step of your procedure, both
the numbers decrease by exactly one?

St. Certainly, but the number of the equations may also be less than
the number of the variables, let us say m < n equations in n variables. In
this case one puts the terms with x_{m+1}, ..., x_n to the right hand side.
These variables may take arbitrary values. For every set of values
x_{m+1}, ..., x_n, there exists one solution x_1, ..., x_m, as there are as many of
these variables as there are equations. The number n - m is the degree of
freedom of our system, as the values of n - m variables may be chosen
arbitrarily.

T. And if there are more equations than variables?

St. Then there cannot exist any solution. It is obvious that n
variables cannot satisfy a system of more than n conditions.

T. But it seems to me that the system x = 1, 2x = 2 has a solution
although it is a system of two equations with one variable.

St. But these equations are not different. Equations which differ
by a common factor only cannot be considered as different, and it is common
sense to consider only such equations which are different.

T. Thus, if two equal equations are given, one of them should be
dropped.

St. Yes.

T. But this must be done also at the later stages of the procedure.

St. I cannot follow you exactly.

T. Consider:    7x_1 - 38x_2 + 3x_3 = 13
                3x_1 + 13x_2 + 2x_3 = 17
                2x_1 -  5x_2 +  x_3 = 6,

3 different equations in x_1, x_2, x_3. Since the degree of freedom is zero, do
you expect to get exactly one solution?

St. Yes! Put x_3 = 6 - 2x_1 + 5x_2 in the first two equations, then

     x_1 - 23x_2 = -5
    -x_1 + 23x_2 = 5.

T. These equations differ by a factor -1 only; hence one of them
must be dropped. Thus you may choose x_2 in an arbitrary manner.
Put x_1 = 23x_2 - 5, x_3 = -41x_2 + 16, and this will solve the system of
equations for every value of x_2. You have one “degree of freedom”,
although the number of the equations is equal to the number of the variables.

St. That is true. This example is obviously a wicked exception.

T. You may call it an exception if you like, but there are plenty of
them.

St. I see! There may be certain cases where the degree of freedom
is higher than the difference between the number of the variables and the
number of the equations, but at any rate n equations with n variables have
at least one solution which can be found by the method of substitution.

T. Why?

St. Because the number of the equations can decrease, as one may
get two equal equations by the procedure of substitution, and then one
of them must be dropped, but the number of the variables cannot.

T. Try:    9x_1 - 15x_2 - 3x_3 = 13
           3x_1 + 10x_2 + 2x_3 = 1
           2x_1 -  5x_2 -  x_3 = 2.

St. Substitute x_3 = 2x_1 - 5x_2 - 2 in the first and in the second
equation. Hence

    3x_1 = 7
    7x_1 = 5.    This is funny.

T. Indeed, the coefficients of x_2 in both the equations became zero;
thus the equations have to be considered as equations of one variable only.
Now x_1 should be equal to 7/3 and to 5/7; that is impossible.

St. Perhaps I was somewhat rash in conceding that a system of
equations in n variables should be considered as a system in (n - 1) variables
if the coefficients of one of the variables are all equal to zero. Let us retain
x_2 and put

    3x_1 - 0x_2 = 7
    7x_1 + 0x_2 = 5.    Hence x_1 = 5/7 - 0x_2, and therefore 0x_2 = 34/7, x_2 = ∞.

T. What do you mean by ∞?

St. Infinity! That is a number which is greater than every other
number and equal to 1/0.

T. Can you calculate with this ∞ as with an ordinary number?

St. Certainly.

T. Then -∞ = -1/0 = 1/(-0), and for -0 = 0, -∞ = ∞ holds.
Hence 0 = 2∞, and therefore 0 = ∞.

St. No, that is not so. One cannot calculate with this symbol as
with an ordinary number. But, as a matter of fact, this ∞ occurs in
mathematics. It is a somewhat complicated matter, one needs differential
calculus to handle it, and I was hoping that you may explain it to me
clearly one day.

T. On another occasion. The symbol ∞ does occur; sometimes
it is used rightly, sometimes wrongly; use and misuse, both are found in
textbooks. Considering systems of linear equations, we enquire about
those numbers which, taken for x_1, ..., x_n respectively, satisfy those
equations. Numbers can be added, subtracted and multiplied; one can also
divide a number by a number, unless the divisor is zero. The division
by zero is meaningless as far as numbers are concerned.

St. And the example which I attempted just before?

T. It has no solution, since x_1 cannot simultaneously be equal to 7/3
and equal to 5/7.

St. So there exist systems of 3 equations in 3 variables which have
no solutions, systems which have an infinity of solutions, and systems which
have exactly one solution. How did you construct those examples by
which you cornered me?

T. It is not difficult if one knows a little bit of the theory.

St. Those vectorspaces, matrices, rank etc.? Sir, I should be thankful
if you could explain to me some portion of the theory without using
those notions. I do not like those innovations.

T. Then try the simplest case: ax = b.

St. Then x = b/a.

T. Provided a ≠ 0.

St. If a = 0, b ≠ 0, the equation has no solution as there exists no
number x for which 0x = b ≠ 0. If however a = 0, b = 0, then every
value x is a solution.

T. Indeed! This simple case is the seed of the whole theory.
Now try    a_1 x + a_2 y = a
           b_1 x + b_2 y = b.

St. Δy = Δ_1, Δx = Δ_2, where Δ = a_1 b_2 - b_1 a_2, Δ_1 = a_1 b - b_1 a,
Δ_2 = a b_2 - b a_2. If Δ ≠ 0, then x = Δ_2/Δ, y = Δ_1/Δ.

T. In this case there exists no other solution, and these values
satisfy the given equations, as you may verify easily.

St. If Δ = 0, but Δ_1 or Δ_2 is different from zero, there is no solution.
If Δ = Δ_1 = Δ_2 = 0, then every couple of values (x, y) satisfies the
equations.

T. Consider x + 2y = 5, 3x + 6y = 15. Here Δ = Δ_1 = Δ_2 = 0,
but e.g. (x, y) = (0, 0) is not a solution, as x = 5 - 2y.

St This is true, but I cannot understand it The equations Ax = A a ,' 
Ay = A t are satisfied by every pair of values x,y if A = A t = A a = 0. 
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T. These equations are consequences of the given equations , they 
are necessary conditions for solutions x,y of the given systems but they may 
not be sufficient ones The case A = A x = A 2 = 0 contains different cases 

(1) If all the coefficients are equal to zero, then every pair ( x,y ) is a solution. 

(2) Let a x = a z — b l = b. 2 = 0, but (a ,b) ^ (0, 0) then there is no solution 

(3) Let at least one of the 4 coefficients on the left side be ^0 Without 

loss of generality, a,^0 Put b x a, = \ 

Hence 0 = A = a x (b, — A. a 2 ), b 2 = \a 2 

0 = A x = a,(b — A a), b = \a 

(6j x + b. 2 y — b) = A (a t x + a 2 y — a) Hence x — (a — a 2 y) . a x for 

arbitrary y furnishes all the solutions There are therefore 5 different 
cases 
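
The five cases of the dialogue can be traced in a short program. The following is only a sketch under stated assumptions: the function name `classify_2x2` and the labels it returns are illustrative and not part of the text, and exact fractions are used so that the tests for zero are reliable.

```python
from fractions import Fraction


def classify_2x2(a1, a2, a, b1, b2, b):
    """Classify the system a1*x + a2*y = a, b1*x + b2*y = b.

    Returns a pair (label, solution); the labels are illustrative names
    for the five cases distinguished in the dialogue.
    """
    a1, a2, a, b1, b2, b = (Fraction(v) for v in (a1, a2, a, b1, b2, b))
    delta = a1 * b2 - b1 * a2
    delta1 = a1 * b - b1 * a
    delta2 = a * b2 - b * a2
    if delta != 0:
        # Exactly one solution: x = delta2/delta, y = delta1/delta.
        return ("unique", (delta2 / delta, delta1 / delta))
    if delta1 != 0 or delta2 != 0:
        # delta = 0, but delta1 or delta2 differs from zero.
        return ("none", None)
    # From here on delta = delta1 = delta2 = 0.
    if a1 == a2 == b1 == b2 == 0:
        if a == 0 and b == 0:
            return ("every pair", None)   # case (1)
        return ("none", None)             # case (2)
    # Case (3): one equation is a multiple of the other, so one unknown
    # can be chosen arbitrarily.
    return ("one free unknown", None)


print(classify_2x2(1, 2, 5, 3, 6, 15))   # the teacher's example
```

For the teacher's example $x + 2y = 5$, $3x + 6y = 15$ the function reports one free unknown, in agreement with $x = 5 - 2y$ for arbitrary $y$.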

St. For a higher number of variables and of equations a full analysis
may become very complicated. How to tackle the problem for an
arbitrary $n$?

T. For this, I propose to you to study the notions of vectorspace,
rank, matrix, determinant etc., as explained in the following articles.

1-2 n-vectors

Definition 1. An ordered set of n numbers is called an n-vector
$$\alpha = (a_1, a_2, \ldots, a_n).$$
The n numbers $a_i$ defining $\alpha$ are called its coordinates. As the set is supposed
to be an ordered one, the n-vector will in general be altered by the
interchange of the coordinates.

Definition 2. The product of a number $c$ and an n-vector $\alpha$ is the
n-vector
$$c\alpha = (ca_1, ca_2, \ldots, ca_n).$$

Definition 3. The sum of $\alpha$ and an n-vector $\beta = (b_1, \ldots, b_n)$ is the
n-vector
$$\alpha + \beta = (a_1 + b_1, a_2 + b_2, \ldots, a_n + b_n).$$

From these definitions it follows:

$\alpha + \beta = \beta + \alpha$                              commutative law,
$\alpha + (\beta + \gamma) = (\alpha + \beta) + \gamma$        associative law,
$c(\alpha + \beta) = c\alpha + c\beta$                         1st distributive law,
$(c_1 + c_2)\alpha = c_1\alpha + c_2\alpha$                    2nd distributive law.

As these laws hold, one can use the notation of sum of n-vectors in the
same manner as it is used for numbers. Thus
$$\sum_j c_j \alpha^j = \Bigl(\sum_j c_j a^j_1, \ldots, \sum_j c_j a^j_n\Bigr),$$
where $j = 1, \ldots, m$, the $c_j$ are arbitrary numbers, and $\alpha^j = (a^j_1, \ldots, a^j_n)$ are
n-vectors.

The n-vector $(-1)\alpha$ is called the negative of $\alpha$, and is denoted by $-\alpha$. The
inverse of the addition of $\alpha$ is the addition of $-\alpha$. As in elementary
arithmetic, this operation is called the subtraction of $\alpha$, and is denoted by
the sign $-$. Accordingly $\beta - \alpha$ is put for $\beta + (-\alpha)$.

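Notations:

$0 = (0, 0, \ldots, 0)$                   zero-vector
$\varepsilon^1 = (1, 0, \ldots, 0)$       first unit-vector
$\varepsilon^2 = (0, 1, \ldots, 0)$       second unit-vector
$\varepsilon^n = (0, 0, \ldots, 1)$       n-th unit-vector

Formulas: $\alpha - \alpha = 0$, $c \cdot 0 = 0$, $\alpha = \sum_i a_i \varepsilon^i$.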



The vectors of Plane Geometry can be considered as 2-vectors, those of
Solid Geometry as 3-vectors. In Geometry, these vectors are added by
putting them together in such a manner that the endpoint of the first vector
is the starting point of the second one. The result of the addition is the same
as here; n-vectors occur also on other occasions. Let e.g. n be the number
of the depositors of a bank, and $a_1, a_2, \ldots, a_n$ be their balances on a certain
day; then the state of the bank on that day is represented by the n-vector
$(a_1, a_2, \ldots, a_n) = \alpha$. Similarly the alteration of the state on that day is
represented by an n-vector $\beta$, and the state of the bank on the following
day is given by $\alpha + \beta$.
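
The bank example can be written out with n-vectors represented as tuples. This is merely an illustration: the helper names `add` and `scale` and the balances are invented for the purpose and do not come from the text.

```python
def add(alpha, beta):
    """Sum of two n-vectors (Definition 3): coordinates are added in order."""
    if len(alpha) != len(beta):
        raise ValueError("both vectors must have the same number of coordinates")
    return tuple(a + b for a, b in zip(alpha, beta))


def scale(c, alpha):
    """Product of a number c and an n-vector (Definition 2)."""
    return tuple(c * a for a in alpha)


# Hypothetical balances of n = 3 depositors and the alterations of one day.
balances = (120, 45, 300)
alterations = (-20, 5, 0)

next_day = add(balances, alterations)
print(next_day)
```

Subtraction needs no separate definition: `add(beta, scale(-1, alpha))` is the n-vector $\beta - \alpha$.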

1-3 Vectorspaces

Definition 1. An n-vector $\alpha$ is said to be dependent on the n-vectors
$\alpha^j$ ($j = 1, \ldots, m$) if $\alpha$ can be represented by $\alpha = \sum_j c_j \alpha^j$.

Definition 2. The n-vectors depending on $\alpha^1, \ldots, \alpha^m$ are said to form
a vectorspace generated by $\alpha^1, \ldots, \alpha^m$.

Proposition 1. If $\beta^1, \ldots, \beta^r$ belong to a vectorspace V, then every
vector dependent on them belongs to V.

Proof. Let V be generated by $\alpha^1, \ldots, \alpha^m$, say $\beta^j = \sum_k c^j_k \alpha^k$; then
$\sum_j c_j \beta^j = \sum_k d_k \alpha^k$, where $d_k = \sum_j c_j c^j_k$. Hence the proposition.

Definition 1 can also be expressed in this manner: $\alpha^{m+1}$ is dependent
on $\alpha^1, \ldots, \alpha^m$ if there exists an equation
$$\sum_{j=1}^{m+1} k_j \alpha^j = 0, \text{ where } k_{m+1} \neq 0. \qquad (1)$$

Definition 3. $\alpha^1, \ldots, \alpha^{m+1}$ are independent if $\sum_j k_j \alpha^j = 0$ implies
$k_1 = \cdots = k_{m+1} = 0$.

From this definition and the above remark it follows directly:

Proposition 2. $m + 1 > 1$ n-vectors are independent if and only
if none of them is dependent on the m other ones; a single n-vector is
"independent" if it is different from the zero-vector.

Definition 4. A set of independent n-vectors generating a vectorspace V
is called a basis of V.

Proposition 3. Every vectorspace V containing n-vectors $\neq 0$ has
a basis.

Proof. Let V be generated by $\alpha^1, \ldots, \alpha^m$. If these n-vectors do
not form a basis, then either they are all $= 0$, or one of them, say $\alpha^m$, is
dependent on the other ones. In the first case, V contains no n-vector $\neq 0$; in
the second case, V is generated by $\alpha^1, \ldots, \alpha^{m-1}$. Thus one can reduce
the number of the generating n-vectors till one gets a generating system
of independent n-vectors. At every step, the number of generating
n-vectors decreases; hence after less than m steps the procedure cannot
be repeated any more, i.e. the generating n-vectors are independent. Thus
they form a basis.

Proposition 4. Let $\alpha^1, \ldots, \alpha^m$ be a basis of V, $\beta = \sum_j c_j \alpha^j$ and
$c_m \neq 0$; then $\alpha^1, \ldots, \alpha^{m-1}, \beta$ is a basis of V.

Proof. Since $\beta$ is contained in V, it follows from prop. 1 that every
n-vector of the vectorspace V' generated by $\alpha^1, \ldots, \alpha^{m-1}, \beta$ is contained
in V; on the other hand $\alpha^m$, and therefore every n-vector of V, belongs
to V'. Hence V' = V. To prove that the n-vectors are independent, we
suppose that there exists a system of numbers $d_1, \ldots, d_m$, not all
vanishing, such that
$$0 = d_1 \alpha^1 + d_2 \alpha^2 + \cdots + d_{m-1} \alpha^{m-1} + d_m \beta
= (d_1 + d_m c_1) \alpha^1 + \cdots + (d_{m-1} + d_m c_{m-1}) \alpha^{m-1} + d_m c_m \alpha^m.$$
Since $\alpha^1, \ldots, \alpha^{m-1}$ are independent, $d_m \neq 0$, and furthermore $c_m \neq 0$ holds, the
coefficients on the right hand side are not all vanishing; this contradicts
the supposition that $\alpha^1, \ldots, \alpha^m$ are independent. Hence the proposition.

Proposition 5. Let V be a vectorspace, $\alpha^1, \ldots, \alpha^m$ its basis, and
$\beta^1, \ldots, \beta^t$ be t independent n-vectors in V; then we get a new basis of V
on replacing t suitable elements of the basis by the n-vectors $\beta^j$; thus
$t \leq m$ holds.

Proof. (By mathematical induction.) From prop. 4 it follows that
the theorem holds for $t = 1$. Let it hold for $t = r$; without loss of generality,
we suppose that $\beta^1, \ldots, \beta^r, \alpha^{r+1}, \ldots, \alpha^m$ is a basis of V. Thus
$$\beta^{r+1} = c_1 \beta^1 + \cdots + c_r \beta^r + c_{r+1} \alpha^{r+1} + \cdots + c_m \alpha^m.$$
As $\beta^{r+1}$ is not dependent on $\beta^1, \ldots, \beta^r$, at least one of the numbers
$c_{r+1}, \ldots, c_m$ is different from zero, say $c_{r+1} \neq 0$. From prop. 4 it follows
that $\alpha^{r+1}$ can be replaced by $\beta^{r+1}$ in the basis. Hence proposition 5.

Proposition 6. Every basis of a vectorspace V contains the same
number of elements; this number is called the rank of V.

Proof. Let m be the number of elements of a basis of V, and t be the
number of elements of an arbitrary system of independent n-vectors in V.
From prop. 5 it follows that $t \leq m$. Hence there exists one system of m,
but no system of more than m independent n-vectors in V. The number
of elements of any basis is therefore equal to the maximum number of
independent n-vectors. Hence the proposition.

Definition 5. If every n-vector of a vectorspace V' is an n-vector
of V, then V' is a subspace of V. This is denoted by
$$V' \subseteq V.$$

Proposition 7. If $V' \subseteq V$, either V' = V, or rank V' < rank V.

Proof. The n-vectors $\beta^1, \ldots, \beta^t$ which form a basis of V' are t
independent n-vectors of V. Thus it follows from prop. 5 that t suitable n-vectors
of any basis of V can be replaced by the $\beta$'s. Hence V has a basis
$\beta^1, \ldots, \beta^t, \alpha^{t+1}, \ldots, \alpha^m$, where m = rank V. Hence $t \leq m$. In the special
case when t = m, the vectorspaces V and V' have the same basis
$\beta^1, \ldots, \beta^t$, and this implies V = V'.

To state that the subspace V' of V is different from V one uses the notation
$$V' \subset V.$$

Proposition 8. The rank of a vectorspace of n-vectors is at most n.

Proof. The n unit-vectors generate a vectorspace which contains
every n-vector. Hence the proposition follows from prop. 7.

Proposition 9. Between $p > n$ n-vectors there exists always a linear
equation with coefficients not all vanishing.

Proof. If the proposition is not true, there must exist $p > n$
independent n-vectors, contrary to prop. 8.

Proposition 10. Let A be a system of n-vectors with the property
that the sum of any two n-vectors of A as well as the product of any number
and an n-vector of A belongs to A; then A is a vectorspace.

Proof. There cannot exist more than n independent n-vectors in A;
say $\alpha^1, \ldots, \alpha^r$ form a greatest possible system of independent n-vectors
of A; then every n-vector of A depends on the $\alpha$'s, but every n-vector
depending on them must belong to A; thus A is
a vectorspace generated by $\alpha^1, \ldots, \alpha^r$.


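Proposition 11. The n-vectors
$$\begin{aligned}
\beta^1 &= (1, 0, \ldots, 0, b_1, \ldots, b_{n-r}) \\
\beta^2 &= (0, 1, \ldots, 0, c_1, \ldots, c_{n-r}) \\
&\;\;\vdots \\
\beta^r &= (0, 0, \ldots, 1, h_1, \ldots, h_{n-r})
\end{aligned} \qquad (2)$$
are independent.

Proof. Suppose $\lambda = \sum_j d_j \beta^j$. Then $\lambda = (d_1, d_2, \ldots, d_r, q_1, \ldots, q_{n-r})$
for certain numbers $q_1, \ldots, q_{n-r}$. Hence $\lambda \neq 0$, unless $d_1 = d_2 = \cdots = d_r = 0$.


1-4 Matrices. The method of "Sweep out"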



Definition. A matrix M is a rectangular scheme consisting of mn
numbers called the elements of M, which are arranged in m (horizontal)
rows and n (vertical) columns. The rows can be considered as n-vectors
which generate a vectorspace R(M), and the columns are m-vectors
generating a vectorspace C(M). If every element of M is equal to zero, M is called
the zero-matrix 0.

Consider e.g. the right hand side of 1-3, (2). Here m = r. The row-vectors
are independent, hence rank R(M) = r; the first r column-vectors are also
independent, and since the vectors are m-vectors, rank C(M) = r. It will
be proved that the ranks of those two vectorspaces are always equal, and
this number will be called the rank of the matrix M. To get this result,
some operations will be introduced which neither alter the vectorspace
R(M), nor the rank of the vectorspace C(M). By these operations the matrix
is gradually "swept out", i.e. in a certain portion of the matrix, the elements
are replaced by zero, and finally the matrix is reduced to a type which is
similar to the matrix 1-3, (2). The method of "sweep out" is very important
for the solution of systems of linear equations. Let A be the matrix
$$A = \begin{pmatrix} a^1_1 & \cdots & a^1_n \\ \vdots & & \vdots \\ a^m_1 & \cdots & a^m_n \end{pmatrix} \qquad (1)$$
and let $\alpha^1, \ldots, \alpha^m$ be the n-vectors formed by the rows. The vectorspace
R(A) is obviously not altered by the following operations:

I. Replace $\alpha^j$ by $c\alpha^j$, where $c \neq 0$ (row-multiplication).

II. Replace $\alpha^j$ by $\alpha^j + d\alpha^k$, where $k \neq j$ (row-addition).

III. Omit $\alpha^k$, if $\alpha^k = 0$ (row-omission).

Let $\alpha_1, \ldots, \alpha_n$ be the m-vectors formed by the columns, and let
$$\sum_s g_s \alpha_s = 0 \qquad (2)$$
hold. It will be shown that the same equation holds after any one of the
operations I, II, III has been performed on the matrix A. Of course (2)
can be expressed by
$$\sum_s g_s a^i_s = 0, \qquad (2')$$
for $i = 1, \ldots, m$.

Then for any particular j, k the equations
$$\sum_s g_s \, c \, a^j_s = 0 \quad \text{and} \quad \sum_s g_s (a^j_s + d a^k_s) = 0$$
hold; hence (2') remains invariant for the operations I and II. The
operation III means only the omission of a condition which is identically satisfied.
Hence every linear equation (2) between the column-vectors is invariant for
the operations I, II, III. The inverse operation of I is an operation of
the same type where c is replaced by $c^{-1}$; the inverse operation of II is
an operation II where d is replaced by $-d$; the inverse operation of III
is the addition of a new coordinate which takes the value 0 for every
column-vector. Hence a linear equation (2) cannot hold after the
operations I, II, III unless it held before. Let r of the column-vectors
be independent, and the other column-vectors be dependent on them;
then these r vectors form a basis, and rank C(A) = r. By the operations
I, II, III, those r vectors remain independent and the other vectors
remain dependent on them. Hence the rank of C(A) is not altered. The
essence of these considerations can be formulated in the following manner.

Proposition 1. By repeated use of the operations I, II, III the
vectorspace R(A) and the rank of the vectorspace C(A) are not altered.

It may be noticed that the vectorspace C(A) will in general be altered,
e.g. the number of the coordinates may decrease; on the other hand
rank R(A) is invariant, since R(A) itself is not altered.

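Theorem 1. By repeated use of the operations I, II, III, the matrix
$A \neq 0$ can be transformed into
$$\begin{pmatrix}
1 & 0 & \cdots & 0 & * & \cdots & * \\
0 & 1 & \cdots & 0 & * & \cdots & * \\
\vdots & & & \vdots & \vdots & & \vdots \\
0 & 0 & \cdots & 1 & * & \cdots & *
\end{pmatrix} \qquad (3)$$
or into a matrix which differs from (3) by a permutation of the columns
only. (Asterisks are put for numbers of any value.) The rows of this
matrix form a basis of R(A).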

Proof. If any row-vector is equal to 0, this row should be omitted.
Neither by I nor by II can the matrix be transformed into 0. Thus
we will suppose that at every later stage of the operations given in this
proof, every row-vector 0 will be omitted automatically; the matrix cannot
be annihilated thereby. Let $a^1_1$ be different from zero; replace $\alpha^1$ by
$(a^1_1)^{-1} \alpha^1$, thus $a^1_1$ is made equal to 1 by the operation I; then replace
$\alpha^i$ by $\alpha^i - a^i_1 \alpha^1$ (operation II) for $i = 2, 3, \ldots, m$. By this sequence of
operations, the first column is "swept out", i.e. one element (the first one)
is made one, whereas the other elements are made zero; by the following
operations the first row will not be multiplied by any number, and it will
not be added to any other row; hence the first column will remain "swept
out". If $a^1_1 = 0$, there exists a number $j_1$ so that $a^1_{j_1} \neq 0$; then we
consider the column $j_1$ as the "first" column and we sweep it out accordingly.
Since in the proposition of the theorem a permutation of the columns does
not matter, we may suppose without loss of generality that $j_1 = 1$. After
the first column is swept out, we denote the elements again as in (1).
Now $a^2_1 = 0$, and every row with vanishing elements is omitted, hence there
is an element $a^2_{j_2} \neq 0$, for which $j_2 > 1$. Without loss of generality we can
suppose $j_2 = 2$. Again we sweep out the second column on replacing $\alpha^2$ by
$(a^2_2)^{-1} \alpha^2$ and $\alpha^k$ by $\alpha^k - a^k_2 \alpha^2$ for $k \neq 2$. In this manner we can repeat
the procedure till the matrix is either reduced to the form (3), or differs
from it by a permutation of the columns only. The rows of the matrix
generate R(A), and as they are independent (see 1-3, prop. 11), they form
a basis of R(A).
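
The procedure of the proof may be sketched as a short program. This is an illustration under stated assumptions and not a part of the text: the name `sweep_out` is invented, exact fractions are used for the numbers, and instead of permuting the columns the program records for every row the column $j_1, j_2, \ldots$ which has been swept out.

```python
from fractions import Fraction


def sweep_out(rows):
    """Sweep out a matrix given as a list of rows.

    Returns (swept, pivots): the remaining rows, and for each of them the
    index of the column that was swept out by means of this row.
    """
    m = [[Fraction(x) for x in row] for row in rows]
    m = [row for row in m if any(row)]          # operation III
    pivots = []
    r = 0
    while r < len(m):
        # First element of row r different from zero; its column is j_r.
        j = next(k for k, x in enumerate(m[r]) if x != 0)
        p = m[r][j]
        m[r] = [x / p for x in m[r]]            # operation I
        for i in range(len(m)):
            if i != r:
                f = m[i][j]
                # operation II: subtract f times row r from row i
                m[i] = [x - f * y for x, y in zip(m[i], m[r])]
        # operation III; rows above r keep their leading 1 and are never
        # omitted, so the index r remains valid.
        m = [row for row in m if any(row)]
        pivots.append(j)
        r += 1
    return m, pivots


A = [[1, 2, 3],
     [2, 4, 6],
     [1, 0, 1]]
swept, pivots = sweep_out(A)
print(swept, pivots)
```

Here the second row is twice the first one and is omitted in the course of the procedure; the two remaining rows $(1, 0, 1)$ and $(0, 1, 1)$ are of the type (3) and form a basis of R(A).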

Theorem 2. rank R(A) = rank C(A) for every matrix A. This number
is called the rank of the matrix A.

Proof. If A is the zero-matrix, then both the vectorspaces are of rank
zero. If $A \neq 0$, then we sweep out A; by these operations the ranks are
not altered, as shown by prop. 1. In (3), the rank of both the vectorspaces
is equal to the number of rows; the same holds for the matrices which one
gets by interchanging the columns of (3). Hence the theorem.
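
Theorem 2 can be checked on an example by sweeping out a matrix and the matrix whose rows are the columns of the first one. The following sketch assumes exact fractions; the names `rank` and `transpose` are illustrative only.

```python
from fractions import Fraction


def rank(rows):
    """Number of rows which remain after the matrix has been swept out."""
    m = [[Fraction(x) for x in row] for row in rows]
    m = [row for row in m if any(row)]
    r = 0
    while r < len(m):
        j = next(k for k, x in enumerate(m[r]) if x != 0)
        p = m[r][j]
        m[r] = [x / p for x in m[r]]
        for i in range(len(m)):
            if i != r:
                f = m[i][j]
                m[i] = [x - f * y for x, y in zip(m[i], m[r])]
        m = [row for row in m if any(row)]
        r += 1
    return len(m)


def transpose(rows):
    """Matrix whose rows are the columns of the given matrix."""
    return [list(col) for col in zip(*rows)]


A = [[1, 2, 3, 4],
     [2, 4, 6, 8],
     [0, 1, 1, 0]]
print(rank(A), rank(transpose(A)))
```

Both numbers are equal to 2 for this matrix: `rank(A)` is rank R(A), and `rank(transpose(A))` is rank C(A).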


1-5 Orthogonality. Homogeneous linear equations

Definition. Two n-vectors $\alpha = (a_1, \ldots, a_n)$ and $\beta = (b_1, \ldots, b_n)$ are
said to be orthogonal if
$$\sum_i a_i b_i = 0$$
holds.

Thus if $\alpha$ is orthogonal to $\beta$, then $\beta$ is orthogonal to $\alpha$, i.e. orthogonality
is a symmetric relation. This notion offers the opportunity to apply
vectors to systems of linear equations. Consider at first homogeneous equations
$$\begin{aligned}
a^1_1 x_1 + \cdots + a^1_n x_n &= 0 \\
\cdots \qquad & \\
a^m_1 x_1 + \cdots + a^m_n x_n &= 0
\end{aligned} \qquad (1)$$


For the matrix $A = ((a^i_k))$, its rows etc. we use the notations of 1-4. Every
ordered system of numbers
$$\xi = (x_1, \ldots, x_n) \qquad (2)$$
which satisfies the equations (1) is called a solution of (1). A solution is
therefore an n-vector which is orthogonal to $\alpha^1, \ldots, \alpha^m$. Let (2) be a solution
of (1) and let $\sum_j c_j \alpha^j = \alpha = (a_1, \ldots, a_n)$ be an arbitrary vector of the vectorspace
R(A). As
$$\sum_i a^j_i x_i = 0$$
holds for $j = 1, \ldots, m$,
$$0 = \sum_j c_j \sum_i a^j_i x_i = \sum_i x_i \sum_j c_j a^j_i = \sum_i x_i a_i.$$
Hence $\xi$ is orthogonal to $\alpha$, i.e.

Proposition 1. Every solution of (1) is orthogonal to every vector
of R(A).

If $\xi$ is a solution of (1) and c is any number, then $c\xi$ is also a solution
of (1).

Let furthermore
$$\eta = (y_1, \ldots, y_n) \qquad (2')$$
be a solution of (1); then for $j = 1, \ldots, m$,
$$0 = \sum_i a^j_i x_i + \sum_i a^j_i y_i = \sum_i a^j_i (x_i + y_i)$$
holds. Hence $\xi + \eta$ is also a solution of (1).

From 1-3, prop. 10 it follows that the solutions of (1) form a vectorspace.
Every n-vector of this vectorspace is orthogonal to every n-vector of R(A).
Hence

Proposition 2. The solutions of (1) form a vectorspace X(A). Every
n-vector of X(A) is orthogonal to every n-vector of R(A).

To get the solutions of (1), one need only know the vectors which are
orthogonal to any basis of R(A). Using the method of sweep out, one
gets the basis in a suitable standard form
$$\begin{pmatrix}
1 & 0 & \cdots & 0 & -b^1_1 & \cdots & -b^1_{n-r} \\
0 & 1 & \cdots & 0 & -b^2_1 & \cdots & -b^2_{n-r} \\
\vdots & & & \vdots & \vdots & & \vdots \\
0 & 0 & \cdots & 1 & -b^r_1 & \cdots & -b^r_{n-r}
\end{pmatrix} \qquad (3)$$
(the asterisks of 1-4, (3) have been replaced by $-b^j_k$) or by a matrix which
differs from (3) by a suitable permutation of the columns. Let
$$\begin{pmatrix} 1, & \ldots, & n \\ i_1, & \ldots, & i_n \end{pmatrix} \qquad (4)$$
be this permutation. An n-vector (2) is orthogonal to the n-vectors of
R(A), and is therefore a solution of (1), if and only if it satisfies the
conditions
$$x_{i_j} = b^j_1 x_{i_{r+1}} + \cdots + b^j_{n-r} x_{i_n} \quad \text{for } j = 1, \ldots, r. \qquad (5)$$
The values of $x_{i_{r+1}}, \ldots, x_{i_n}$ can be chosen arbitrarily; the remaining r
coordinates of the n-vector are uniquely defined by them.
Put e.g. $(x_{i_{r+1}}, \ldots, x_{i_n}) = (1, 0, \ldots, 0)$,
then $(x_{i_1}, \ldots, x_{i_r}) = (b^1_1, b^2_1, \ldots, b^r_1)$.

Thus one gets n - r independent n-vectors of X(A) which have the
coordinates ordered corresponding to the permutation (4). Hence $(x_{i_1}, \ldots, x_{i_n})$
take the following sets of values:
$$\begin{aligned}
&(b^1_1, \ldots, b^r_1, 1, 0, \ldots, 0) \\
&(b^1_2, \ldots, b^r_2, 0, 1, \ldots, 0) \\
&\qquad \cdots \\
&(b^1_{n-r}, \ldots, b^r_{n-r}, 0, 0, \ldots, 1)
\end{aligned} \qquad (6)$$
These n-vectors form the basis of a vectorspace X' of rank (n - r), and
X' is a subspace of X(A). In X', the coordinates $x_{i_{r+1}}, \ldots, x_{i_n}$ take
every set of values, but from (5) it follows that by these values a solution
of (1) is uniquely determined. Hence every solution of (1) is an n-vector of
X', hence X' = X(A). This result will be formulated by the following
propositions.

Proposition 3. rank R(A) + rank X(A) = n. (7)

Proposition 4. To solve (1), one applies the method of "sweep out"
to the matrix A; let the result be a matrix which differs from (3) by a
permutation (4) of the columns only; then a basis of X(A) is given by (6).
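
Proposition 4 may be illustrated by a short program. It is a sketch under stated assumptions: the names `sweep_out` and `solution_basis` are invented, exact fractions are used, and the permutation (4) is represented by the list of swept-out columns instead of an actual rearrangement of the columns.

```python
from fractions import Fraction


def sweep_out(rows):
    """Sweep out a matrix; return the remaining rows and their pivot columns."""
    m = [[Fraction(x) for x in row] for row in rows]
    m = [row for row in m if any(row)]
    pivots = []
    r = 0
    while r < len(m):
        j = next(k for k, x in enumerate(m[r]) if x != 0)
        p = m[r][j]
        m[r] = [x / p for x in m[r]]
        for i in range(len(m)):
            if i != r:
                f = m[i][j]
                m[i] = [x - f * y for x, y in zip(m[i], m[r])]
        m = [row for row in m if any(row)]
        pivots.append(j)
        r += 1
    return m, pivots


def solution_basis(rows, n):
    """Basis of X(A) for a matrix A with n columns, as in (6)."""
    swept, pivots = sweep_out(rows)
    free = [k for k in range(n) if k not in pivots]
    basis = []
    for f in free:
        xi = [Fraction(0)] * n
        xi[f] = Fraction(1)          # one arbitrary coordinate is put 1
        for row, p in zip(swept, pivots):
            xi[p] = -row[f]          # condition (5); row[f] stands for -b
        basis.append(xi)
    return basis


A = [[1, 2, 3],
     [2, 4, 6],
     [1, 0, 1]]
basis = solution_basis(A, 3)
print(basis)
```

For this matrix rank R(A) = 2, and the program furnishes the single basis vector $(-1, -1, 1)$ of X(A), in agreement with (7).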

Every n-vector $\alpha$ of R(A) is orthogonal to every n-vector of X(A).
The n-vectors which are orthogonal to the n-vectors of X(A) form a
vectorspace of rank n - (n - r) = r which contains R(A) as a subspace. From
1-3, prop. 7 it follows that this vectorspace is identical with R(A). This
result can be expressed by

Proposition 5. If $\sum_i a_i x_i = 0$ is satisfied by every solution of (1),
then $(a_1, \ldots, a_n)$ is an n-vector of R(A).

The propositions 1, 2, 3, 5 can be condensed into the following theorem.

Theorem. Every system (1) of homogeneous linear equations generates
two vectorspaces R(A) and X(A) for which (7) holds. R(A) is generated
by the rows of (1); X(A) consists of the solutions of (1). An n-vector is
orthogonal to R(A) [to X(A)] if and only if it belongs to X(A) [to R(A)].

It is important for the theory of homogeneous linear equations and its
applications that the problem of solving such equations leads to two
vectorspaces which are connected by a reciprocity or duality. R(A) defines
X(A) in the same manner as X(A) defines R(A).
The advantage of the use of homogeneous systems is mainly based on this
fact. The treatment of non-homogeneous systems is more complicated.
One may apply to them the method of homogeneisation, as will be done in
the following article. This method is used also in geometry, when ordinary
coordinates are replaced by homogeneous ones, and thereby the space
is extended to a projective space. The duality so obtained corresponds
exactly to the duality between X(A) and R(A).

1-6 Systems of non-homogeneous linear equations

Given a system of linear equations
$$\begin{aligned}
a^1_1 x_1 + \cdots + a^1_n x_n &= a^1_0 \\
\cdots \qquad & \\
a^m_1 x_1 + \cdots + a^m_n x_n &= a^m_0
\end{aligned} \qquad (1)$$
If the terms on the right hand side are not all equal to zero, the system
will be considered jointly with the following two homogeneous systems
$$\begin{aligned}
a^1_1 y_1 + \cdots + a^1_n y_n &= 0 \\
\cdots \qquad & \\
a^m_1 y_1 + \cdots + a^m_n y_n &= 0
\end{aligned} \qquad (2)$$
and
$$\begin{aligned}
a^1_1 z_1 + \cdots + a^1_n z_n - a^1_0 z_0 &= 0 \\
\cdots \qquad & \\
a^m_1 z_1 + \cdots + a^m_n z_n - a^m_0 z_0 &= 0.
\end{aligned} \qquad (3)$$
The matrix of (2) is denoted by A, and the matrix of (3) by $A_0$. The following
propositions are obvious, though important.

Proposition 1. If $\xi = (x_1, \ldots, x_n)$ is a solution of (1), and
$\eta = (y_1, \ldots, y_n)$ is a solution of (2), then $\xi + \eta$ is a solution of (1).

Proposition 2. If $\xi$ and $\xi^1$ are solutions of (1), then $\eta = \xi - \xi^1$ is a
solution of (2).

Proposition 3. If $\xi^1, \ldots, \xi^s$ are solutions of (1), and $c_1 + \cdots + c_s = 1$,
then $\xi = c_1 \xi^1 + \cdots + c_s \xi^s$ is a solution of (1).

Proposition 4. If $(x_1, \ldots, x_n)$ is a solution of (1), then $(x_1, \ldots, x_n, 1)$
is a solution of (3).

Proposition 5. Let $(z_1, \ldots, z_n, z_0)$ be a solution of (3). If $z_0 = 0$,
then $(z_1, \ldots, z_n)$ is a solution of (2); if $z_0 \neq 0$, then $(z_1/z_0, \ldots, z_n/z_0)$ is
a solution of (1).





The propositions 1 and 2 can be condensed into the following theorem.

Theorem 1. If $\xi$ is an arbitrary solution of (1), then we get all the
solutions of (1) by adding to $\xi$ the solutions $\eta$ of (2).

As has been shown in the introduction, (1) may have no solution.
A necessary and sufficient condition for the existence of a solution will now
be established.

Theorem 2. The system (1) has solutions if and only if rank A
= rank $A_0$.

Proof. Let r be the rank of A. Since r is equal to the rank of the
vectorspace generated by the columns of A, and $A_0$ is formed by A and a
column added to A, rank $A_0$ is either equal to r or to r + 1. The solutions
of (2) form a vectorspace X(A) of rank n - r. Let
$$(y_1, \ldots, y_n) \qquad (4)$$
be the vectors of this space. The vectors
$$(y_1, \ldots, y_n, 0) \qquad (5)$$
form a vectorspace X' consisting of (n + 1)-vectors. n-vectors (4) are
independent if and only if the corresponding vectors (5) are independent.
Hence
$$\text{rank } X' = \text{rank } X(A) = n - r.$$
Also $X' \subseteq X(A_0)$, since every vector of X' is a solution of (3). Hence
$$n - r = \text{rank } X' \leq \text{rank } X(A_0) = (n + 1) - \text{rank } A_0, \qquad (6)$$
where equality holds if and only if the vectorspaces X' and $X(A_0)$
are identical. But X' is identical with $X(A_0)$ if in every solution of (3)
the coordinate $z_0$ is equal to zero; from prop. 5 it follows that in this case
(1) has no solution. Hence for n - r = n + 1 - rank $A_0$, that is, when
rank $A_0$ = r + 1, the system (1) has no solution. If $X(A_0)$ contains
(n + 1)-vectors not belonging to X', then it follows from prop. 5 that (1) has
solutions. This holds if and only if n - r < n + 1 - rank $A_0$, i.e.
rank $A_0$ < r + 1. Therefore in this case rank $A_0$ = r. Hence theorem 2.

To find out the solutions of (1), one may solve the homogeneous system
(3) by the method of sweep-out and consider those solutions only for which
$z_0 = 1$ holds. The method of sweep-out leads to a matrix with n + 1
columns and t = rank $A_0$ rows [see 1-5, (3)] of the type
$$\begin{pmatrix}
1 & 0 & \cdots & 0 & -b^1_1 & \cdots & -b^1_{n+1-t} \\
0 & 1 & \cdots & 0 & -b^2_1 & \cdots & -b^2_{n+1-t} \\
\vdots & & & \vdots & \vdots & & \vdots \\
0 & 0 & \cdots & 1 & -b^t_1 & \cdots & -b^t_{n+1-t}
\end{pmatrix}$$
This matrix corresponds to a system of homogeneous linear equations which
is equivalent to (3):
$$\begin{aligned}
z_{i_1} &= b^1_1 z_{i_{t+1}} + \cdots + b^1_{n+1-t} z_{i_{n+1}} \\
&\;\;\cdots \\
z_{i_t} &= b^t_1 z_{i_{t+1}} + \cdots + b^t_{n+1-t} z_{i_{n+1}}
\end{aligned}$$
where $i_1, \ldots, i_n, i_{n+1}$ is a permutation of the indices $1, \ldots, n, 0$. If none
of the indices $i_1, \ldots, i_t$ is the index 0, then we can suppose without loss
of generality that $i_{n+1}$ is zero. Then the system (1) is equivalent to
$$\begin{aligned}
x_{i_1} &= b^1_1 x_{i_{t+1}} + \cdots + b^1_{n-t} x_{i_n} + b^1_{n+1-t} \\
&\;\;\cdots \\
x_{i_t} &= b^t_1 x_{i_{t+1}} + \cdots + b^t_{n-t} x_{i_n} + b^t_{n+1-t}
\end{aligned} \qquad (7)$$
In this case, the equations are solvable, and therefore r = t holds. The
case of insolvability can therefore occur only if we cannot sweep out the
matrix $A_0$ without sweeping out the last column of it, i.e. if there comes out
a row in which every element except the last one is zero. A row of this
kind corresponds to the condition $z_0 = 0$, and if this condition is satisfied,
the system (1) has no solution. Hence


Theorem 3. On applying the method of sweep out to the matrix $A_0$
in such a manner that the coefficients of $z_0$ remain the last column, either
one gets a solution (7) or a row comes out which corresponds to an equation
$z_0 = 0$, and which shows that the system (1) has no solution.
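
Theorem 3 can be sketched in the same manner as the homogeneous case. The following program is an illustration only: the names `sweep_out` and `solve` are invented, exact fractions are used, the matrix is supposed to have at least one row, and of the solutions (7) only that one is returned in which the arbitrary unknowns are put equal to zero.

```python
from fractions import Fraction


def sweep_out(rows):
    """Sweep out a matrix; return the remaining rows and their pivot columns."""
    m = [[Fraction(x) for x in row] for row in rows]
    m = [row for row in m if any(row)]
    pivots = []
    r = 0
    while r < len(m):
        # The first element different from zero is chosen, so the last
        # column is swept out only by a row of the type (0, ..., 0, c).
        j = next(k for k, x in enumerate(m[r]) if x != 0)
        p = m[r][j]
        m[r] = [x / p for x in m[r]]
        for i in range(len(m)):
            if i != r:
                f = m[i][j]
                m[i] = [x - f * y for x, y in zip(m[i], m[r])]
        m = [row for row in m if any(row)]
        pivots.append(j)
        r += 1
    return m, pivots


def solve(rows, rhs):
    """One solution of the system with matrix `rows` and right hand side
    `rhs`, or None if a row corresponding to z_0 = 0 comes out."""
    n = len(rows[0])
    # Matrix A_0 of (3): the coefficients of z_0 form the last column.
    a0 = [list(row) + [-b] for row, b in zip(rows, rhs)]
    swept, pivots = sweep_out(a0)
    if n in pivots:
        return None                      # the condition z_0 = 0: no solution
    x = [Fraction(0)] * n                # arbitrary unknowns are put 0
    for row, p in zip(swept, pivots):
        x[p] = -row[n]                   # (7) with z_0 = 1
    return x


print(solve([[1, 2], [3, 6]], [5, 15]))
print(solve([[1, 2], [3, 6]], [5, 16]))
```

For $x + 2y = 5$, $3x + 6y = 15$ the program furnishes the solution $(5, 0)$; when 15 is replaced by 16, a row of the type $(0, 0, 1)$ comes out and the system has no solution.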

From 1-5 and 1-6 it follows that the solvability of a system of linear
equations does not depend on the number of the equations and of the
unknown quantities, but on the ranks of certain matrices. These ranks are
limited by the number of the equations and of the unknown quantities, as
the rank of a matrix can neither exceed the number of the rows, nor the
number of the columns.


Proposition 6. If m = n, the system (1) has exactly one solution if
and only if rank A = n.

Proof. Rank A = n' cannot exceed n. If n' < n, then either (1)
has no solution or its solutions are in a (1, 1)-correspondence to the
solutions of (2) [see theorem 1], and the solutions of (2) form a vectorspace of
rank n - n' [see 1-5, prop. 2 and 3], and this vectorspace contains more
than one element. If rank A = n, then the solutions of (2) form a
vectorspace of rank 0, i.e. the system (2) has the trivial solution $(0, \ldots, 0)$ only.

As rank $A_0$ cannot exceed n, rank $A_0$ = rank A; there exists therefore
a solution of (1), but from theorem 1 it follows that there exists one
solution only.

The coordinates of n-vectors and the elements of matrices have been
supposed to be "numbers". No special supposition has been made whether
this term should be understood as real numbers or as complex numbers. Of
course the preceding investigations are made in such a manner that they
are independent of any special supposition. It may be mentioned in
anticipation that the investigations up to here hold unaltered if the notion
of number is replaced by the notion of "element of any particular* field".
The general notion of field will be explained in Chapter II, and will not
be used before.

1-7 The method of orthogonalisation†

The numbers occurring in this section are supposed to be real.‡
Especially the n-vectors are supposed to have real coordinates.

Definition 1. The scalar product of two n-vectors $\alpha = (a_1, \ldots, a_n)$ and
$\beta = (b_1, \ldots, b_n)$ is the number
$$\alpha\beta = \sum_i a_i b_i.$$

From this definition follow:

1. $\alpha\beta = \beta\alpha$                                commutative law
2. $\alpha(\beta + \gamma) = \alpha\beta + \alpha\gamma$      distributive law
3. $\alpha\beta = 0$, if and only if $\alpha$ is orthogonal to $\beta$
4. $\alpha\alpha > 0$, if $\alpha \neq 0$;
   $\alpha\alpha = 0$, if $\alpha = 0$.

*The field may also be finite; in this case a vectorspace contains a finite number of
elements only. That vectorspaces are infinite sets of vectors has not been used
anywhere (see especially the proof of 1-6, prop. 6).

†This section may be omitted at a first reading.

‡The method can easily be generalised for abstract real fields, i.e. fields in which 0
cannot be represented as a sum of squares.

Definition 2. The non-negative square root of $\alpha\alpha$ is said to be the
length $|\alpha|$ of $\alpha$.

Thus $|\alpha| > 0$, unless $\alpha = 0$. That the length is a generalisation
of the notion of absolute value is seen from the case n = 1, and also
from the case n = 2, when $\alpha = (a_1, a_2)$ is represented by the complex
number $a_1 + a_2 i$. This justifies the notation $|\alpha|$. Given a system of
homogeneous linear equations with the matrix A. To solve the system by
the method of orthogonalisation, one forms a system of n independent
n-vectors of length 1
$$\beta^1, \ldots, \beta^n \qquad (1)$$
each of them being orthogonal to each other, in such a way that
$$\beta^1, \ldots, \beta^r \qquad (2)$$
is a basis of the vectorspace R(A) generated by the rows of A. The
vectorspace generated by
$$\beta^{r+1}, \ldots, \beta^n \qquad (3)$$
contains only n-vectors which are orthogonal to the n-vectors of R(A), and
these are solutions of the given system. It will be shown how the
n-vectors (1) can be found, and that (3) is a basis of X(A). By these
considerations, one gets a second proof of 1-5, prop. 3.

Definition 3. The n-vectors $\beta^1, \ldots, \beta^m$ form an orthogonal system,
if
$$\beta^i \beta^k = 0 \text{ for } i \neq k, \qquad \beta^i \beta^k = 1 \text{ for } i = k. \qquad (4)$$
The n-vectors of an orthogonal system are therefore of length 1, and are
mutually orthogonal.

Let
$$\beta^1, \ldots, \beta^m \qquad (5)$$
be an orthogonal system, and
$$\alpha = \sum_i c_i \beta^i; \qquad (6)$$
then it follows that
$$\alpha\beta^k = c_k, \text{ for } k = 1, \ldots, m. \qquad (7)$$
Hence $\alpha = 0$ implies $c_1 = \cdots = c_m = 0$. Thus:

Proposition 1. The n-vectors of an orthogonal system are independent.

Let $\kappa$ be an arbitrary n-vector; put in (6):
$$c_i = \kappa\beta^i \text{ and } \lambda = \kappa - \alpha; \qquad (8)$$
then it follows from (7) that
$$\lambda\beta^k = 0, \text{ for } k = 1, \ldots, m.$$
$\lambda$ is therefore orthogonal to the n-vectors (5). If $\lambda \neq 0$, then put
$$|\lambda|^{-1} \lambda = \beta^{m+1}; \qquad (9)$$
then the n-vectors
$$\beta^1, \ldots, \beta^{m+1} \qquad (10)$$
form an orthogonal system. As $\beta^{m+1}$ is independent of the n-vectors (5),
so are $\lambda$ and $\kappa$. If $\lambda = 0$, then $\kappa = \alpha$ is dependent on (5). Hence

Proposition 2. If $\kappa$ is independent of (5), and $\beta^{m+1}$ is defined by
(6), (8) and (9), then (10) is an orthogonal system, and the vectorspace
generated by (10) is the same as the vectorspace generated by $\beta^1, \ldots, \beta^m, \kappa$.
If $\lambda = 0$, then $\kappa$ is dependent on (5).

Proposition 2 can be applied for two different purposes. Firstly:
to extend an orthogonal system (5) to an orthogonal system (10) in such
a manner that a particular n-vector $\kappa$ which is independent of (5) is
dependent on (10). Secondly: to state whether an arbitrary n-vector is
dependent on (5).

Let
$$\begin{aligned}
\beta^1 &= (b^1_1, \ldots, b^1_n) \\
&\;\;\cdots \\
\beta^m &= (b^m_1, \ldots, b^m_n)
\end{aligned} \qquad (11)$$
be an orthogonal system; to find out if the unit-vector $\varepsilon^k$ is dependent
on (5), put in (8) $\kappa = \varepsilon^k$, and therefore $c_i = \varepsilon^k \beta^i = b^i_k$; hence
$$\alpha = \sum_i b^i_k \beta^i = \Bigl(\sum_i b^i_k b^i_1, \ldots, \sum_i b^i_k b^i_n\Bigr).$$
Hence $\lambda = 0$ if and only if
$$\sum_i b^i_k b^i_j = 1 \text{ for } k = j, \qquad \sum_i b^i_k b^i_j = 0 \text{ for } k \neq j.$$
Hence $\varepsilon^k$ is dependent on (5) if and only if the column-vector $\beta_k$ of (11) is
of length 1 and orthogonal to the other column-vectors of (11). For m < n,
one finds easily by this method a unit-vector which is independent of (5).

To apply the method of orthogonalisation to a matrix A, omit the
row-vectors which are equal to 0.
Put $|\alpha^1|^{-1} \alpha^1 = \beta^1$,
omit the rows of A dependent on $\beta^1$,
put $\lambda = \alpha^2 - (\alpha^2 \beta^1) \beta^1$, $\beta^2 = |\lambda|^{-1} \lambda$,
omit the rows of A dependent on $\beta^1, \beta^2$,
put $\lambda' = \alpha^3 - [(\alpha^3 \beta^1) \beta^1 + (\alpha^3 \beta^2) \beta^2]$, $\beta^3 = |\lambda'|^{-1} \lambda'$,
and continue till one gets the orthogonal system (2) of independent n-vectors
which forms a basis of R(A).

This orthogonal system can further on be extended by the help of an
n-vector $\kappa$ which is independent of it. One may choose this vector e.g. out
of the unit-vectors. The procedure can be repeated and stops after n steps,
as the n orthogonal n-vectors
$$\beta^1, \ldots, \beta^r, \beta^{r+1}, \ldots, \beta^n$$
are independent and form therefore a basis of the complete vectorspace
of rank n.

An arbitrary n-vector can be represented by
$$\alpha = \sum_i c_i \beta^i, \text{ where } \alpha\beta^k = c_k, \text{ for } k = 1, \ldots, n.$$
$\alpha$ is a solution of the system of homogeneous equations with the matrix
A if and only if it is orthogonal to R(A), i.e. if it is orthogonal to
$\beta^1, \ldots, \beta^r$. Hence $\alpha$ is a solution if and only if
$$c_1 = \cdots = c_r = 0;$$
hence the solutions are the n-vectors dependent on
$$\beta^{r+1}, \ldots, \beta^n.$$
By this result 1-5, prop. 3 is proved without reference to the method of sweep
out.

The method of orthogonalisation has some advantage over the method of sweep out, as it furnishes bases of the vectorspaces R(A) and X(A) which are orthogonal systems, but it is not very convenient for practical calculation. "Sweep out" needs only rational operations, whereas for orthogonalisation a square root must be drawn at every step.

1-8. Determinants. 

Let $A = ((a^i_k))$ be a square shaped matrix with n rows and n columns. It is possible to allot to A a certain number which therefore is a function of the matrix. One may consider this value also as a function of the $n^2$ elements of the matrix or as a function of the n column-vectors, for when the elements or the column-vectors are given in their proper order, the matrix is given too. Obviously functions of $n^2$ variables can be formed in an infinity of different ways. The particular function which will be considered here is called a determinant, and it is denoted by
\[ \det A = \det ((a^i_k)) = \begin{vmatrix} a^1_1 & \dots & a^1_n \\ \vdots & & \vdots \\ a^n_1 & \dots & a^n_n \end{vmatrix} = \det(\alpha_1, \dots, \alpha_n). \tag{1} \]
One may consider $\alpha_1, \dots, \alpha_n$ as an abbreviation put for the vertical column-vectors of $((a^i_k))$; then the notation given on the right hand side becomes identical with the notation on the left. It is however useful to express the determinant as a function of the column-vectors, since the determinant will be shown to be a linear function of those n-vectors.

The determinant is supposed to have the following properties:

(a) $\det(\alpha_1, \dots, c\,\alpha_m, \dots, \alpha_n) = c \det(\alpha_1, \dots, \alpha_n)$,

(b) $\det(\alpha_1, \dots, \alpha_m + \alpha_k, \dots, \alpha_n) = \det(\alpha_1, \dots, \alpha_n)$, for $k \ne m$,

(c) $\det(\varepsilon^1, \dots, \varepsilon^n) = 1$.

It is not obvious that there exists a function which has these properties. It may be that these contradict one another. If e.g. in (b) the restriction $k \ne m$ is omitted, the conditions contradict one another, since from (a) and (c) it follows that $\det(2\varepsilon^1, \varepsilon^2, \dots, \varepsilon^n) = 2$, whereas from (b) and (c) it would follow that it is equal to 1. We assume at first that such a function exists and derive its properties; existence and uniqueness will be proved later on.

Proposition 1. If any one of the n-vectors $\alpha_1, \dots, \alpha_n$ is equal to 0, then det A is zero.

Proof. Let $\alpha_i = 0$, and therefore $\alpha_i = 0 \cdot \alpha_i$; by (a),
\[ \det A = \det(\alpha_1, \dots, 0 \cdot \alpha_i, \dots, \alpha_n) = 0 \cdot \det A = 0. \]

Proposition 2. det A is not altered when $\alpha_k$ is replaced by $\alpha_k + c\,\alpha_i$, for $i \ne k$.

Proof. For c = 0, the proposition is obvious; for $c \ne 0$, by (a) and (b),
\[ \det A = \frac{1}{c} \det(\dots, c\,\alpha_i, \dots, \alpha_k, \dots) = \frac{1}{c} \det(\dots, c\,\alpha_i, \dots, \alpha_k + c\,\alpha_i, \dots) = \det(\dots, \alpha_i, \dots, \alpha_k + c\,\alpha_i, \dots). \]

This proposition shows that "column-additions" do not alter a determinant, where column-addition is understood in the same sense as "row-addition" was in 1-4. If any column is dependent on the other columns, say $\alpha_n = c_1 \alpha_1 + \dots + c_{n-1} \alpha_{n-1}$, one can reduce it by n - 1 column-additions to the n-vector 0. From prop 1, it follows therefore

Proposition 3. When the column-vectors are dependent, the determinant is equal to zero.

Another consequence of prop 2 is

Proposition 4. If two columns are interchanged, the determinant changes its sign.

Proof.
\begin{align*}
\det A &= \det(\dots, \alpha_i, \dots, \alpha_k, \dots) \\
&= \det(\dots, \alpha_i, \dots, \alpha_k + \alpha_i, \dots) \\
&= \det(\dots, \alpha_i - (\alpha_k + \alpha_i), \dots, \alpha_k + \alpha_i, \dots) \\
&= \det(\dots, -\alpha_k, \dots, \alpha_i, \dots) \\
&= -\det(\dots, \alpha_k, \dots, \alpha_i, \dots).
\end{align*}

From this proposition follows

Proposition 5. An even permutation of the columns does not alter a determinant; an odd permutation alters a determinant to its negative.

Proposition 6. $\det(\varepsilon^{i_1}, \dots, \varepsilon^{i_n}) = +1$ or $= -1$ according as the permutation $i_1, \dots, i_n$ is even or odd.

A determinant with two equal columns is equal to zero; this is a special case of prop 3 (it is also a consequence of prop 4).

Proposition 7. Let B be the matrix which one gets by replacing a particular column $\alpha_i$ of A by $\beta$, then
\[ \det A + \det B = \det(\alpha_1, \dots, \alpha_i + \beta, \dots, \alpha_n). \tag{2} \]

Proof. Let the n - 1 column-vectors $\alpha_k$ (for $k \ne i$) be dependent, then the three determinants occurring in (2) are zero each and the formula holds. Let $\alpha_i$ depend on the n - 1 n-vectors $\alpha_k$, then det A = 0, and the determinant on the right hand side of (2) can be reduced by column-addition to det B. Without loss of generality, we can therefore suppose that $\alpha_1, \dots, \alpha_n$ are independent; they therefore form a basis of the vectorspace containing every n-vector; hence $\beta$ depends on them, say $\beta = c_1 \alpha_1 + \dots + c_n \alpha_n$. By column-addition, it is possible to reduce the column $\beta$ in B to $c_i \alpha_i$, and similarly the column $\alpha_i + \beta$ in the determinant on the right hand side of (2) to $(1 + c_i)\,\alpha_i$. Hence both the sides are equal to $(1 + c_i) \det A$. Hence the proposition.

Let $B^s$ be the matrix one gets by replacing $\alpha_i$ by $\beta_s$, and let
\[ \alpha_i = c_1 \beta_1 + \dots + c_t \beta_t; \]
then
\begin{align*}
\det A &= \det(\alpha_1, \dots, c_1 \beta_1 + \dots + c_{t-1} \beta_{t-1}, \dots, \alpha_n) + \det(\alpha_1, \dots, c_t \beta_t, \dots, \alpha_n) \\
&= \det(\alpha_1, \dots, c_1 \beta_1 + \dots + c_{t-1} \beta_{t-1}, \dots, \alpha_n) + c_t \det B^t.
\end{align*}
By repetition of this procedure one gets
\[ \det A = c_1 \det B^1 + \dots + c_t \det B^t. \]
This formula can be expressed as follows:

Proposition 8. A determinant is a linear function of each of its column-vectors.

Consider especially $\alpha_i = \sum_k a^k_i\, \varepsilon^k$ and
\[ A^k_i = \det(\alpha_1, \dots, \alpha_{i-1}, \varepsilon^k, \alpha_{i+1}, \dots, \alpha_n), \]
then we get from prop 8

Proposition 9.
\[ \det A = \sum_k a^k_i A^k_i. \tag{5} \]

The number $A^k_i$ is said to be the cofactor of $a^k_i$. If $A_i$ is the n-vector $A_i = (A^1_i, \dots, A^n_i)$, then det A is the scalar product
\[ \det A = \alpha_i A_i. \]
Especially $A^k_1 = \det(\varepsilon^k, \alpha_2, \dots, \alpha_n)$ and
\[ \det A = \sum_k a^k_1 \det(\varepsilon^k, \alpha_2, \dots, \alpha_n). \]
By applying prop 9 to $A^k_1$, one gets
\[ A^k_1 = \sum_s a^s_2 \det(\varepsilon^k, \varepsilon^s, \alpha_3, \dots, \alpha_n), \]
and therefore
\[ \det A = \sum_{k,s} a^k_1 a^s_2 \det(\varepsilon^k, \varepsilon^s, \alpha_3, \dots, \alpha_n). \]
By repeating this procedure n times one gets:
\[ \det A = \sum a^k_1 a^s_2 \cdots a^q_n \det(\varepsilon^k, \varepsilon^s, \dots, \varepsilon^q), \tag{6} \]
where k, s, ..., q take the values 1, ..., n independently. The determinants on the right hand side take the values +1, -1, 0 according as k, s, ..., q is an even permutation of 1, ..., n, or an odd permutation of them, or the indices are not all different. Hence

Proposition 10. If there exists a function det A satisfying the conditions (a), (b), (c), then det A must be equal to
\[ D(A) = \sum \pm\, a^k_1 a^s_2 \cdots a^q_n, \tag{7} \]
where the sum has to be taken over all those products $a^k_1 a^s_2 \cdots a^q_n$ for which the upper indices are permutations of 1, ..., n, the sign + being used for even, - for odd permutations.

Theorem 1. The function det A satisfying the conditions (a), (b), (c) exists and it is equal to the function D(A) as defined by (7).

Proof. From prop 10 it follows that det A is either non-existent or it is equal to D(A). To prove the theorem, it must therefore be shown that D(A) satisfies the conditions (a), (b), (c). Thus:

(a) If $\alpha_i$ is replaced by $c\,\alpha_i$, then in every term of the sum (7) exactly one factor is multiplied by c; hence D(A) is replaced by cD(A).

(c) If $\alpha_j = \varepsilon^j$ for j = 1, ..., n, then $a^j_j = 1$, $a^j_k = 0$ for $j \ne k$. Hence $a^1_1 a^2_2 \cdots a^n_n = 1$, whereas the other terms in the sum (7) are equal to zero. Since 1, 2, ..., n is an even permutation, D(A) = +1.

(b) To prove that the condition (b) holds for D(A), we prove at first that D(A) satisfies prop 4. If in A the lower indices i and k are interchanged, the terms in (7) are not altered, but every even permutation becomes odd and conversely; hence D(A) is transformed into -D(A). If $\alpha_i = \alpha_k$, the exchange of i and k cannot alter D(A); hence D(A) = -D(A) = 0. If in A the column-vector $\alpha_m$ is replaced by $\alpha_m + \alpha_k$, in every term $a^s_1 \cdots a^r_m \cdots a^q_n$ the factor $a^r_m$ is replaced by $a^r_k + a^r_m$, and the term is therefore increased by $a^s_1 \cdots a^r_k \cdots a^q_n$ (with $a^r_k$ in the m-th place). The sum of these additional terms taken with the corresponding sign ± is equal to the determinant which is got when $\alpha_m$ is replaced by $\alpha_k$. Hence D(A) is replaced by
\[ D(A) + D(\alpha_1, \dots, \alpha_k, \dots, \alpha_k, \dots, \alpha_n) = D(A). \]
Hence the theorem.

The propositions 1 to 10, which have been established under the supposition that a function satisfying the conditions (a), (b), (c) exists, hold therefore unconditionally. It follows furthermore from prop 10 that this function (the determinant) is uniquely determined.
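The sum over permutations in (7) and the three defining conditions (a), (b), (c) can be checked on a small example. The following Python sketch is illustrative only: the names `sign`, `D`, `scale_column` and `add_column` and the sample matrix are assumptions made for this example, and matrices are stored as lists of rows with indices counted from 0.

```python
from itertools import permutations

def sign(perm):
    """+1 for an even permutation, -1 for an odd one (by counting inversions)."""
    inv = sum(perm[i] > perm[j]
              for i in range(len(perm)) for j in range(i + 1, len(perm)))
    return -1 if inv % 2 else 1

def D(A):
    """Sum of signed products: one factor from each column, upper (row) indices a permutation."""
    n = len(A)
    total = 0
    for p in permutations(range(n)):
        term = sign(p)
        for col in range(n):
            term *= A[p[col]][col]   # row index p[col], column index col
        total += term
    return total

def scale_column(A, m, c):
    """Condition (a): column m is multiplied by c."""
    return [[c * x if j == m else x for j, x in enumerate(row)] for row in A]

def add_column(A, m, k):
    """Condition (b): column k is added to column m (k != m)."""
    return [[x + row[k] if j == m else x for j, x in enumerate(row)] for row in A]

A = [[2, 1, 0],
     [1, 3, 1],
     [0, 1, 4]]            # hypothetical sample matrix
I3 = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]

value = D(A)                              # 18 for this sample
scaled = D(scale_column(A, 1, 5))         # 5 * 18
added = D(add_column(A, 2, 0))            # unchanged
unit = D(I3)                              # 1
```

Since the sum has n! terms, this form is suited only to very small n.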


Proposition 11.
\[ \sum_k a^k_j A^k_i = 0, \quad \text{for } j \ne i. \tag{8} \]

Proof. If $\alpha_i$ is replaced by $\alpha_j$ in A, the determinant is zero. Hence (8) follows from (5).

Proposition 12. Let $\alpha_1, \dots, \alpha_n$ be the column-vectors, $\alpha^1, \dots, \alpha^n$ be the row-vectors of A, then
\[ \det(\alpha_1, \dots, \alpha_n) = \det(\alpha^1, \dots, \alpha^n). \tag{9} \]

Proof. In every term $a^k_1 \cdots a^q_n$ of (7) the factors can be ordered in such a way that the upper indices have their natural order 1, ..., n; then the order of the lower indices is the inverse permutation of k, ..., q. A permutation is odd (even) when its inverse permutation is odd (even). Hence
\[ \det(A) = \sum \pm\, a^1_j \cdots a^n_p, \]
where j, ..., p takes all the permutations of 1, ..., n, and the sign ± has to be taken according as the permutation is even or odd. Hence $\det(A) = \det(\alpha^1, \dots, \alpha^n)$.

From this proposition it follows that in every proposition which holds for determinants, we may exchange the column-vectors and the row-vectors (i.e. the upper and the lower indices). This duality of rows and columns in a determinant can be extended to the cofactors $A^k_i$ by the help of the following proposition.

Proposition 13. If one replaces in the matrix A the element $a^k_i$ by the value 1 and those elements which are in the same row or in the same column as $a^k_i$ by 0, the determinant of this matrix is equal to
\[ A^k_i = \det(\alpha_1, \dots, \alpha_{i-1}, \varepsilon^k, \alpha_{i+1}, \dots, \alpha_n) = \det(\alpha^1, \dots, \alpha^{k-1}, \varepsilon^i, \alpha^{k+1}, \dots, \alpha^n). \tag{10} \]

Proof. Let $\alpha_i$ be replaced by $\varepsilon^k$, then every term $a^s_1 \cdots a^g_i \cdots a^q_n$ of (7) is zero unless g = k. Since therefore in every non-vanishing term the factor $a^k_i$ occurs, no factor $a^k_j$, $j \ne i$, can occur. Hence $A^k_i$ is independent of the elements $a^k_j$, which may therefore be replaced by any value, e.g. by zero. In exactly the same manner it can be proved that if in A the row $\alpha^k$ is replaced by the n-vector $\varepsilon^i$, the determinant becomes independent of $a^g_i$ for $g \ne k$. Hence the proposition.

The essence of some of the propositions proved in this article is given by the following theorem and the subsequent formulas.

Theorem 2. det A is a linear function of its row-vectors (its column-vectors). It is invariant to row-addition (column-addition), to even permutation of the rows (columns) and to the interchanging of the rows with the columns with the corresponding index. det A changes its sign only, if an odd permutation of the rows (the columns) is performed. If rank A < n, then det A = 0.
\[ \det A = \det(\alpha_1, \dots, \alpha_n) = \det(\alpha^1, \dots, \alpha^n) = \sum \pm\, a^k_1 \cdots a^q_n = \sum \pm\, a^1_k \cdots a^n_q, \]
\[ \sum_k a^k_j A^k_i = \sum_k a^j_k A^i_k = \begin{cases} 0, & \text{for } j \ne i, \\ \det A, & \text{for } j = i. \end{cases} \tag{11} \]

1-9 The Minors of a determinant 

Again, let A be a matrix with n rows and n columns. It has been proved in 1-8, prop 3 and theorem 2, that if the columns of A are dependent and therefore the rows are dependent, then det A = 0. It is important to know that the converse holds too, i.e. that if the determinant is zero, the columns (and the rows) are dependent. Of course if det A = 0, then 1-8, (11) shows that
\[ \sum_k \alpha^k A^k_i = \sum_k \alpha_k A^i_k = 0, \quad \text{for } i = 1, \dots, n. \]
Hence the columns (the rows) are dependent; only in the case when every $A^k_i$ is zero, this conclusion fails. To get a general proof, one has to go somewhat deeper into the matter. Consider
\[ A^1_1 = \sum \pm\, a^1_1 a^s_2 \cdots a^q_n, \quad \text{where } a^1_1 = 1, \]
since 1, s, ..., q is an even or odd permutation of 1, ..., n according as s, ..., q is an even or odd permutation of 2, ..., n. Hence
\[ A^1_1 = \sum \pm\, a^s_2 \cdots a^q_n, \]
where + has to be taken for the even permutations of 2, ..., n, and - for the odd ones. Hence $A^1_1$ is the determinant of the matrix which is generated by striking out the first row and the first column of A. Similarly the determinant $A^k_i$ is generated by replacing $a^k_i$ by 1 and putting 0 for the other elements in the k-th row and for those in the i-th column. The k-th row may be interchanged by a simple transposition with the (k - 1)-th, then with the (k - 2)-th, etc. Thus by k - 1 transpositions the k-th row is displaced to the first place, the relative order of the remaining rows not being altered. By this operation $A^k_i$ takes the factor $(-1)^{k-1}$. Subsequently the i-th column is moved to the first place, and $A^k_i$ is therefore replaced by $(-1)^{i+k} A^k_i$. Then the first element of the first row is equal to 1, whereas the other elements of the first row and of the first column are equal to zero. As has been shown above, these two lines can be omitted without altering the determinant. Hence

Proposition 1. $(-1)^{i+k} A^k_i$ is equal to the determinant which is generated when the row and the column which intersect in $a^k_i$ are both omitted.
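The relation between the cofactor $A^k_i$ and the determinant with one row and one column struck out can be checked numerically, together with formula 1-8, (11). The following Python sketch is only an illustration: the names `sign`, `D`, `cofactor` and `struck_out` and the sample matrix are assumptions for this example; indices are counted from 0, which does not change the parity of i + k.

```python
from itertools import permutations

def sign(perm):
    """+1 for an even permutation, -1 for an odd one."""
    inv = sum(perm[i] > perm[j]
              for i in range(len(perm)) for j in range(i + 1, len(perm)))
    return -1 if inv % 2 else 1

def D(A):
    """Determinant as the signed sum over all permutations, as in 1-8, (7)."""
    n = len(A)
    total = 0
    for p in permutations(range(n)):
        term = sign(p)
        for col in range(n):
            term *= A[p[col]][col]
        total += term
    return total

def cofactor(A, k, i):
    """A^k_i: determinant after column i is replaced by the unit-vector e^k."""
    n = len(A)
    B = [[(1 if r == k else 0) if c == i else A[r][c] for c in range(n)]
         for r in range(n)]
    return D(B)

def struck_out(A, k, i):
    """Determinant of the matrix left when row k and column i are omitted."""
    n = len(A)
    return D([[A[r][c] for c in range(n) if c != i]
              for r in range(n) if r != k])

A = [[2, 1, 0],
     [1, 3, 1],
     [0, 1, 4]]     # hypothetical sample matrix
n = len(A)

# Proposition 1: (-1)^(i+k) A^k_i equals the struck-out determinant.
prop1_holds = all((-1) ** (i + k) * cofactor(A, k, i) == struck_out(A, k, i)
                  for k in range(n) for i in range(n))

# 1-8, (11): sum over k of a^k_j A^k_i is det A for j = i and 0 otherwise.
table = [[sum(A[k][j] * cofactor(A, k, i) for k in range(n))
          for i in range(n)] for j in range(n)]
```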

Definition. Let B be a matrix with m rows and n columns. If one omits m - r rows and n - r columns, the determinant of the remaining square-shaped matrix multiplied with e = ±1 is called a minor of B of order r.

E.g. let A be a square-shaped matrix; its minors of highest possible order are det A and -det A. From prop 1 it follows that the cofactors of the elements of a square-shaped matrix are minors. By permutations of the rows and of the columns, a minor is transformed into a minor, as this permutation means a multiplication with ±1 only. Let $k_1, \dots, k_r$ be r < n different numbers. If $\alpha_{i_1}, \dots, \alpha_{i_r}$ are replaced in $\det(\alpha_1, \dots, \alpha_n)$ by $\varepsilon^{k_1}, \dots, \varepsilon^{k_r}$, one gets a determinant (1).

By applying r times prop 1, it follows

Proposition 2. The determinant (1) is a minor of A, and it is generated by omitting in A the rows $k_1, \dots, k_r$ and the columns $i_1, \dots, i_r$; the determinant of the remaining matrix multiplied by e = ±1 is equal to the determinant (1). e is equal to +1 or -1 according as $i_1 + \dots + i_r + k_1 + \dots + k_r$ is even or odd.

As det A is a linear and homogeneous function of minors of order n - 1 [see 1-8, prop 9], det A = 0 if every minor of order n - 1 is equal to zero. Similarly, if every minor of order n - 2 is equal to zero, the same holds for every minor of order n - 1 and therefore for det A. By repeated application of this consideration one gets the following result:

Proposition 3. If every minor of order m is equal to zero, then the same holds for the minors of higher order.

The rank of a matrix is equal to zero if and only if every element is equal to zero, i.e. if every minor of order 1, and therefore every minor, is equal to zero. The connection between the rank of a matrix and the maximum order of non-vanishing minors will be investigated now.

Proposition 4. Let B be a matrix with $m \le n$ rows and n columns. If a minor of B of order m is different from zero, rank B = m.

Proof. Let the minor composed of the m column-vectors $\alpha_s, \dots, \alpha_p$ be different from zero; then these m-vectors are independent [see 1-8, prop 3]. Hence rank B = rank C(B) = m.

If m > n, and a minor of B of order n is different from zero, the matrix can be transformed by interchanging of rows and columns to the case considered in prop 4. Hence rank B = n.

Theorem. If B has a minor of order r which is different from zero, but every minor of higher order (if any) is equal to zero, then rank B = r.

Proof. Since the rank of a matrix is not altered by permutations of rows and of columns, we suppose without loss of generality that the determinant formed by the rows 1, ..., r and the columns 1, ..., r is different from zero. Thus the matrix formed by the rows 1, ..., r
\[ \begin{pmatrix} a^1_1 & \dots & a^1_r & \dots & a^1_n \\ \vdots & & \vdots & & \vdots \\ a^r_1 & \dots & a^r_r & \dots & a^r_n \end{pmatrix} \]
is of rank r, and the row-vectors $\alpha^1, \dots, \alpha^r$ are therefore independent. Consider the matrix formed by $\alpha^1, \dots, \alpha^r, \alpha^v$, $r < v \le n$, and for any particular v, consider those minors of order r + 1 which contain the columns 1, ..., r:
\[ \begin{vmatrix} a^1_1 & \dots & a^1_r & a^1_u \\ \vdots & & \vdots & \vdots \\ a^r_1 & \dots & a^r_r & a^r_u \\ a^v_1 & \dots & a^v_r & a^v_u \end{vmatrix}. \]
Each of these minors is equal to zero, and the cofactors, say
\[ A^1, \dots, A^r, A^v \quad \text{of} \quad a^1_u, \dots, a^r_u, a^v_u, \]
have the same values for every u; they are minors cut out of the columns 1, ..., r. In particular
\[ A^v = \begin{vmatrix} a^1_1 & \dots & a^1_r \\ \vdots & & \vdots \\ a^r_1 & \dots & a^r_r \end{vmatrix} \ne 0. \]
Hence
\[ A^1 a^1_k + \dots + A^r a^r_k + A^v a^v_k = 0 \]
holds for k = 1, ..., r, ..., n. Hence
\[ A^1 \alpha^1 + \dots + A^r \alpha^r + A^v \alpha^v = 0. \]





Since $A^v \ne 0$, the n-vector $\alpha^v$ is dependent on $\alpha^1, \dots, \alpha^r$. This holds for v = r + 1, ..., n. Hence $\alpha^1, \dots, \alpha^r$ form a basis of the vectorspace R(B) generated by the rows of B. Hence rank B = r.

If therefore A has n rows and n columns and det A = 0, then rank A < n, i.e. the rows (columns) are dependent.
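The theorem suggests a direct, if slow, way of finding the rank of a small matrix: look for the largest order r for which some minor does not vanish. The following Python sketch illustrates this under stated assumptions: the names `sign`, `D` and `rank_by_minors` and the sample matrices are introduced only for this example, and integer entries are used so that the test for zero is exact.

```python
from itertools import combinations, permutations

def sign(perm):
    """+1 for an even permutation, -1 for an odd one."""
    inv = sum(perm[i] > perm[j]
              for i in range(len(perm)) for j in range(i + 1, len(perm)))
    return -1 if inv % 2 else 1

def D(A):
    """Determinant as the signed sum over all permutations, as in 1-8, (7)."""
    n = len(A)
    total = 0
    for p in permutations(range(n)):
        term = sign(p)
        for col in range(n):
            term *= A[p[col]][col]
        total += term
    return total

def rank_by_minors(B):
    """Largest r such that some minor of order r is different from zero."""
    m, n = len(B), len(B[0])
    for r in range(min(m, n), 0, -1):          # try the highest order first
        for rows in combinations(range(m), r):
            for cols in combinations(range(n), r):
                sub = [[B[i][k] for k in cols] for i in rows]
                if D(sub) != 0:
                    return r
    return 0                                    # every element is zero

B1 = [[1, 2, 3],
      [2, 4, 6],      # twice the first row
      [1, 0, 1]]
B2 = [[1, 2, 3],
      [2, 4, 6]]
B3 = [[0, 0], [0, 0]]
ranks = [rank_by_minors(B1), rank_by_minors(B2), rank_by_minors(B3)]
```

The number of minors grows very quickly with the size of the matrix, so for numerical work the method of sweep out remains preferable.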

1-9 Generalised cofactors.* By the formula $\det A = \sum_k a^i_k A^i_k$, the determinant is expressed as the scalar product of the n-vector $(a^i_1, \dots, a^i_n)$ and the n-vector $(A^i_1, \dots, A^i_n)$; thus det A is represented as a function which is linear in two different sets of variables (bilinear function), one set consisting of minors of order 1, the other set consisting of minors of order n - 1. This representation can be generalised to a representation as a bilinear function by one set of minors of order m and one set of minors of order n - m. For this purpose the indices 1, ..., n are subdivided into two portions 1, ..., m and t, ..., n, where 1 < m, and m + 1 = t < n. Every term of $\det A = \sum \pm\, a^1_k \cdots a^n_p$ can be represented as the product of two terms
\[ \pm\, a^1_k \cdots a^n_p = (-1)^{e}\, a^1_r \cdots a^m_s \ (-1)^{\delta}\, a^t_{r'} \cdots a^n_{s'}, \tag{1} \]
where $e + \delta$ is even or odd according as r, ..., s, r', ..., s' is an even or an odd permutation of 1, ..., n. Every term generates a partition of the lower indices into two classes, one class r, ..., s of m elements and one class r', ..., s' of n - m elements.

There exist $\binom{n}{m}$ such partitions; each partition corresponds to m! (n - m)! terms of the determinant, as the elements of the first class admit m! permutations and those of the second class (n - m)! permutations. Of course $\binom{n}{m}\, m!\, (n - m)! = n!$ is the number of the terms of the determinant. Consider at first the partition where
\[ r, \dots, s = 1, \dots, m; \qquad r', \dots, s' = t, \dots, n, \quad t = m + 1. \tag{2} \]


* This section may be omitted at a first reading 

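† The expansion of a determinant given in 1-9, (5) is named after Laplace.

Put e = 0 or = 1 according as the permutation r, ..., s is even or odd. Since $e + \delta$ is even or odd according as r, ..., s, r', ..., s' is an even or an odd permutation, $\delta$ is even or odd according as r', ..., s' is even or odd. The sum of these terms is equal to
\[ \sum_{r, \dots, s}^{1, \dots, m} (-1)^{e}\, a^1_r \cdots a^m_s \ \sum_{r', \dots, s'}^{t, \dots, n} (-1)^{\delta}\, a^t_{r'} \cdots a^n_{s'} = A^{1, \dots, m}_{1, \dots, m}\ A^{t, \dots, n}_{t, \dots, n}, \tag{3} \]
where the two factors on the right hand side are the determinants formed by the rows and columns 1, ..., m and by the rows and columns t, ..., n respectively.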

An arbitrary even permutation of the lower indices of the product on the left hand side of (1) transforms this term into another term of the determinant without altering the sign + or -. Let now
\[ r, \dots, s, r', \dots, s' \tag{4} \]
be an even permutation of the indices 1, ..., n, then the terms
\[ (-1)^{e + \delta}\, a^1_r \cdots a^m_s \ a^t_{r'} \cdots a^n_{s'} \]
which one gets by multiplying the two sums on the left hand side of (3) are terms of the determinant, each with the correct sign + or -. If by the permutation of the indices, dashed indices are exchanged with non-dashed ones, none of these terms will be a term of (3). Therefore one gets the n! terms of the determinant with the correct signs each once and only once in the following manner: One performs all the $\binom{n}{m}$ different partitions of the indices 1, ..., n into m indices without dash and n - m indices with dash; one arranges the dashed and the non-dashed indices in such a manner that (4) is an even permutation, and one forms the product (3). This product contains the m! (n - m)! terms of the determinant corresponding to the partition. The sum of all these products is equal to the determinant.

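Hence
\[ \det A = \sum A^{1, \dots, m}_{r, \dots, s}\ A^{t, \dots, n}_{r', \dots, s'}, \tag{5} \]
where each factor is the determinant formed by the rows given by the upper indices and the columns given by the lower indices, and where the sum has to be taken as explained above.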


Since an even permutation of the rows does not alter the determinant, the upper indices 1, ..., m, t, ..., n (where t = m + 1) can be replaced by any particular even permutation of these terms. This permutation must be the same for all the terms of (5). If the permutation is an odd one, the sum (5) is equal to -det A; if the upper indices are not all different, the sum is equal to zero.
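The expansion by minors of order m and n - m can be checked on a small matrix. In the following Python sketch, which is only an illustration, the first m rows are paired with every choice of m columns, and instead of arranging the indices so that (4) is an even permutation, each product is multiplied by the sign of the permutation (chosen columns, remaining columns); this is an equivalent bookkeeping chosen for the example. The names `sign`, `D` and `laplace` and the sample matrix are assumptions.

```python
from itertools import combinations, permutations

def sign(perm):
    """+1 for an even permutation, -1 for an odd one."""
    inv = sum(perm[i] > perm[j]
              for i in range(len(perm)) for j in range(i + 1, len(perm)))
    return -1 if inv % 2 else 1

def D(A):
    """Determinant as the signed sum over all permutations, as in 1-8, (7)."""
    n = len(A)
    total = 0
    for p in permutations(range(n)):
        term = sign(p)
        for col in range(n):
            term *= A[p[col]][col]
        total += term
    return total

def laplace(A, m):
    """Expand det A by minors of the first m rows and of the remaining rows."""
    n = len(A)
    total = 0
    for cols in combinations(range(n), m):       # one partition of the columns
        rest = tuple(c for c in range(n) if c not in cols)
        top = [[A[i][c] for c in cols] for i in range(m)]
        bottom = [[A[i][c] for c in rest] for i in range(m, n)]
        total += sign(cols + rest) * D(top) * D(bottom)
    return total

A = [[1, 2, 0, 3],
     [0, 1, 4, 1],
     [2, 0, 1, 5],
     [1, 1, 1, 1]]     # hypothetical sample matrix
expansions = [laplace(A, m) for m in (1, 2, 3)]
direct = D(A)
```

For m = 1 the expansion reduces to the expansion by cofactors of the first row.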
1-10 Solution of systems of linear equations by the help of determinants


Consider n linear equations with n variables
\begin{align*}
a^1_1 x_1 + \dots + a^1_n x_n &= a^1_0 \\
&\ \ \vdots \\
a^n_1 x_1 + \dots + a^n_n x_n &= a^n_0.
\end{align*}
\[ \tag{1} \]
Multiply the equations with the cofactors of the k-th column and add; then
\[ x_k \det A = \sum_i a^i_0 A^i_k = \det(\alpha_1, \dots, \alpha_0, \dots, \alpha_n), \]
where in the determinant on the right hand side, $\alpha_0$ stands in the k-th place. If $\det A \ne 0$,
\[ x_k = \det(\alpha_1, \dots, \alpha_0, \dots, \alpha_n) / \det A, \tag{2} \]
for k = 1, ..., n.

This condition is necessary, but as there exists a solution and $x_1, \dots, x_n$ is uniquely determined by (2), the condition is sufficient too. Verify this result by putting (2) into (1):
\[ \sum_k a^j_k x_k = \Big\{ \sum_k \sum_i a^j_k\, a^i_0 A^i_k \Big\} / \det A = \Big\{ \sum_i a^i_0 \sum_k a^j_k A^i_k \Big\} / \det A, \]
but
\[ \sum_k a^j_k A^i_k = \begin{cases} \det A, & \text{for } j = i, \\ 0, & \text{for } j \ne i, \end{cases} \]
hence
\[ \sum_k a^j_k x_k = a^j_0. \]
As this result holds for j = 1, ..., n, the equations (1) are satisfied by (2).
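Formula (2) can be tried out on a small system. The following Python sketch is only an illustration: the names `sign`, `D` and `cramer`, the sample coefficients and the right hand side are assumptions for this example, and exact fractions are used so that no rounding occurs.

```python
from fractions import Fraction
from itertools import permutations

def sign(perm):
    """+1 for an even permutation, -1 for an odd one."""
    inv = sum(perm[i] > perm[j]
              for i in range(len(perm)) for j in range(i + 1, len(perm)))
    return -1 if inv % 2 else 1

def D(A):
    """Determinant as the signed sum over all permutations, as in 1-8, (7)."""
    n = len(A)
    total = 0
    for p in permutations(range(n)):
        term = sign(p)
        for col in range(n):
            term *= A[p[col]][col]
        total += term
    return total

def cramer(A, a0):
    """Solve A x = a0 by formula (2); requires det A != 0."""
    d = D(A)
    if d == 0:
        raise ValueError("det A = 0: formula (2) does not apply")
    n = len(A)
    xs = []
    for k in range(n):
        # put the column a0 in the k-th place of A
        Ak = [[a0[r] if c == k else A[r][c] for c in range(n)] for r in range(n)]
        xs.append(Fraction(D(Ak), d))
    return xs

A = [[2, 1, 0],
     [1, 3, 1],
     [0, 1, 4]]        # hypothetical coefficients
a0 = [3, 5, 5]         # hypothetical right hand side
x = cramer(A, a0)

# Verification by putting the solution into the equations, as in the text.
residual = [sum(A[j][k] * x[k] for k in range(3)) - a0[j] for j in range(3)]
```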

For the case when the number of the equations is different from the number of the unknown quantities, the method needs some modification. If the rank of the full matrix is greater than the rank of the homogeneous portion, the system has no solution. Let the two ranks be equal, and equations depending on the other ones be omitted. The rank of the matrices is therefore supposed to be equal to the number m of the equations
\begin{align*}
a^1_1 x_1 + \dots + a^1_m x_m + \dots + a^1_n x_n &= a^1_0 \\
&\ \ \vdots \\
a^m_1 x_1 + \dots + a^m_m x_m + \dots + a^m_n x_n &= a^m_0.
\end{align*}
\[ \tag{3} \]
There exists therefore a determinant formed by m columns of the coefficients on the left hand side which is different from zero. Without loss of generality, suppose that it is the determinant
\[ \begin{vmatrix} a^1_1 & \dots & a^1_m \\ \vdots & & \vdots \\ a^m_1 & \dots & a^m_m \end{vmatrix} = d \ne 0. \]
Multiplying the equations with the cofactors of the k-th column (k = 1, ..., m) and dividing by d, one gets
\[ x_k = d_{k,0} + d_{k,m+1}\, x_{m+1} + \dots + d_{k,n}\, x_n, \quad k = 1, \dots, m, \tag{4} \]
where
\[ d_{k,0} = \frac{1}{d} \det(\alpha_1, \dots, \alpha_0, \dots, \alpha_m), \qquad d_{k,\mu} = -\frac{1}{d} \det(\alpha_1, \dots, \alpha_\mu, \dots, \alpha_m), \quad \mu = m + 1, \dots, n. \]
In the determinants on the right hand side, $\alpha_0$ and $\alpha_\mu$ are supposed to be put in the k-th place.

1-(10)1 Comparison of the different methods for solving systems of linear equations. Three methods for solving linear equations have been discussed: "sweep out", "orthogonalisation", and "determinants"; furthermore the method of "substitution" has been mentioned in the introduction. Common features of these methods are the following:

(1) A given system of linear equations, say 1-10, (1) or (3), is replaced by another one which either gives the solution, if there exists one only [see 1-10, (2)], or shows a method of finding any number of solutions [see 1-10, (4)], if there exist more solutions than one.

(2) The derived linear equations are homogeneous linear combinations of the original equations, such that if the original equations are satisfied, the derived equations hold, i.e. the derived equations are necessary conditions.

(3) The original linear equations are homogeneous linear combinations of the derived ones. Hence the derived equations are sufficient conditions.

By the method of "sweep out" as well as by the method of substitution, this reduction of the given linear equations is done step by step. Consider the method of substitution:

\begin{align*}
a(x) &= a_1 x_1 + \dots + a_n x_n - a_0 = 0 \\
b(x) &= b_1 x_1 + \dots + b_n x_n - b_0 = 0 \\
&\ \ \vdots \\
k(x) &= k_1 x_1 + \dots + k_n x_n - k_0 = 0.
\end{align*}
The symbols on the left hand side are only abbreviations for the linear functions in the centre.

If $a_1 \ne 0$, $x_1 = [a_0 - a_2 x_2 - \dots - a_n x_n] / a_1$. By putting this value into the other equations, we get equations of the type
\[ \frac{b_1}{a_1} [a_0 - a_2 x_2 - \dots - a_n x_n] + b_2 x_2 + \dots + b_n x_n - b_0 = 0. \]
This equation is identical with $b(x) - \frac{b_1}{a_1}\, a(x) = 0$. Thus this "putting in" means the same as the "sweep out" of the first column. By the help of one of these new equations, $x_2$ is represented as a function of $x_3, \dots, x_n$; this value is put into the remaining n - 2 equations; thus the second column is swept out in n - 2 rows. The substitution leads after n - 1 steps to a system of the following type (provided the rank of the matrix of the homogeneous portion is equal to n):
\begin{align*}
a_1 x_1 + a_2 x_2 + \dots + a_n x_n - a_0 &= 0 \\
c_2 x_2 + \dots + c_n x_n - c_0 &= 0 \\
&\ \ \vdots \\
t_n x_n - t_0 &= 0.
\end{align*}
The matrix is swept out below the diagonal. By putting $x_n = t_0 / t_n$ into the other equations the n-th column is swept out. The (n - 1)-th equation is transformed to $s\, x_{n-1} - s_0 = 0$; putting $x_{n-1} = s_0 / s$ into the other equations the (n - 1)-th column is swept out, etc. Finally the matrix is completely swept out and the values of $x_1, \dots, x_n$ are determined.

The method of substitution is therefore not essentially different from the method of sweep out. The method of determinants does not use a procedure by steps. One determines numbers A, B, ..., K such that
\[ A\, a(x) + B\, b(x) + \dots + K\, k(x) \]
is independent of $x_2, \dots, x_n$, and is therefore of the type
\[ u\, x_1 - v = 0, \quad \text{or} \quad x_1 = v / u. \]
This condition is necessary, but it may not be a sufficient one. In the preceding sections it has been shown that by the method of determinants one gets necessary and sufficient conditions for the unknown quantities $x_1, \dots, x_n$ in a suitable form, provided the fundamental condition for the ranks of matrices holds [see 1-6, theorem 2]. For numerical calculation it is sometimes useful to use methods of elimination and of "sweep out" jointly. When one proceeds in this way, it is advisable to consider very carefully whether the necessary conditions stated in this manner are also sufficient.

In general, the methods of determinants and of orthogonalisation are not very suitable for numerical calculation. To calculate a determinant, it is in general not advisable to determine the n! terms of 1-8, (7), but to simplify the determinant by sweeping out the matrix below or above the diagonal. The value of the determinant is then equal to the product of the elements in the diagonal. A determinant can be swept out by row-addition as well as by column-addition. In some cases it is useful to calculate a determinant by the help of 1-8, (5).
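The calculation of a determinant by sweeping out below the diagonal can be sketched in Python as follows. This is an illustrative sketch, not a prescribed procedure: the name `det_sweep` and the sample matrices are assumptions, and only row-additions are used (no interchange of rows), so that by 1-8, theorem 2 the determinant is never altered and no sign has to be tracked. Exact fractions keep the operations rational.

```python
from fractions import Fraction

def det_sweep(A):
    """Determinant by sweeping out below the diagonal with row-additions only."""
    M = [[Fraction(x) for x in row] for row in A]
    n = len(M)
    for c in range(n):
        # look for a row, at or below the diagonal, with a non-zero element in column c
        p = next((r for r in range(c, n) if M[r][c] != 0), None)
        if p is None:
            return Fraction(0)          # the rows c, ..., n are dependent
        if p != c:
            # row-addition: add row p to row c, which makes the diagonal element non-zero
            M[c] = [x + y for x, y in zip(M[c], M[p])]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            # row-addition: subtract f times row c from row r
            M[r] = [x - f * y for x, y in zip(M[r], M[c])]
    product = Fraction(1)
    for i in range(n):
        product *= M[i][i]              # product of the elements in the diagonal
    return product

d1 = det_sweep([[2, 1, 0], [1, 3, 1], [0, 1, 4]])   # hypothetical sample, value 18
d2 = det_sweep([[0, 1], [1, 0]])                    # value -1
d3 = det_sweep([[1, 2], [2, 4]])                    # dependent rows, value 0
```

The number of operations grows roughly like n cubed, whereas the sum 1-8, (7) has n! terms.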

1-(11) Linear transformations. In the preceding sections, the n-vector
\[ \xi = (x_1, \dots, x_n) \tag{1} \]
has been considered as an unknown quantity, whereas the coefficients were supposed to be given numbers. For many applications of the theory (e.g. application to Geometry), it is necessary to investigate the mutual connection between the numbers, n-vectors and matrices occurring in these formulas.

Consider
\begin{align*}
a^1_1 x_1 + \dots + a^1_n x_n &= y_1 \\
&\ \ \vdots \\
a^n_1 x_1 + \dots + a^n_n x_n &= y_n,
\end{align*}
\[ \tag{2} \]
then to every n-vector $\xi$, there corresponds an n-vector
\[ \eta = (y_1, \dots, y_n). \tag{3} \]
This correspondence will be denoted by an arrow
\[ \xi \to \eta. \tag{4} \]
This formula may be read: A transforms $\xi$ into $\eta$. The formula (2) is called a linear transformation. From (2) it follows (the same notations as in previous sections being used)
\[ 0 \to 0, \tag{5} \]
\[ \varepsilon^k \to \alpha_k. \tag{6} \]
If $\xi_j \to \eta_j$, j = 1, 2, ..., m, then
\[ c_1 \xi_1 + \dots + c_m \xi_m \to c_1 \eta_1 + \dots + c_m \eta_m. \tag{7} \]
Hence, if $\xi$ takes all the n-vectors of a vectorspace V, the corresponding vectors form a vectorspace V', as from $\xi \to \eta$, $\xi' \to \eta'$ it follows that $\xi + \xi' \to \eta + \eta'$ and $c\,\xi \to c\,\eta$ [see 1-3, prop 10]. A system of dependent n-vectors is transformed into a system of dependent n-vectors, but the converse may not hold.

Let V be a complete vectorspace of rank n, then $\xi = (x_1, \dots, x_n) = x_1 \varepsilon^1 + \dots + x_n \varepsilon^n$. Hence $\xi \to x_1 \alpha_1 + \dots + x_n \alpha_n$. Every vector of V' can be represented in this manner, and $x_1, \dots, x_n$ take independently all values; V' is therefore generated by $\alpha_1, \dots, \alpha_n$ and rank V' = rank A. Hence

Theorem 1 By a linear transformation (2) a vectorspace is trans- 
formed into a vectorspace and the vectorspace of rank n is transformed into 
a vectorspace of a rank equal to rank A. 

The notion of linear transformation can fully be characterised by the 
manner how sums of vectors and products of numbers and vectors are trans- 
formed This fact is shown by the following theorem 

Theorem 2. If the n-vectors (1) are represented by n-vectors (3) in such a manner that to the sum of two vectors there corresponds the sum of the corresponding vectors and that to the product of a number c and an n-vector there corresponds the product of c and the corresponding n-vector, then the representation is effected by a linear transformation.

Proof. Let $\alpha_k = (a^1_k, \dots, a^n_k)$ be the n-vectors which represent the unit-vectors $\varepsilon^k$ (k = 1, ..., n). Then $x_k \varepsilon^k$ is represented by $x_k \alpha_k$, and $\xi = (x_1, \dots, x_n) = \sum_k x_k \varepsilon^k$ is represented by $\sum_k x_k \alpha_k$, i.e. by the n-vector $\eta$ which is determined by (2) and (3). Hence the theorem.


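1-(11)1 Composition of transformations. Product of matrices. Let $\xi$ be transformed into $\eta$ by 1-(11), (2) and $\eta$ be transformed into $\zeta$ by another linear transformation
\begin{align*}
b^1_1 y_1 + \dots + b^1_n y_n &= z_1 \\
&\ \ \vdots \\
b^n_1 y_1 + \dots + b^n_n y_n &= z_n.
\end{align*}
\[ \tag{1} \]
The matrix $((b^k_j))$ will be denoted by B. Then
\[ z_k = \sum_j b^k_j\, y_j = \sum_{j,s} b^k_j\, a^j_s\, x_s = \sum_s g^k_s\, x_s, \quad \text{where } g^k_s = \sum_j b^k_j\, a^j_s. \tag{2} \]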





Thus $\xi$ is transformed into $\zeta$ by a linear transformation which is said to be composed of the transformations $\xi \to \eta$ and $\eta \to \zeta$. The matrix $((g^k_s)) = G$ is said to be the product
\[ G = B A. \]
To get the elements $g^k_s$ of G, one must multiply the elements of the k-th row of B with the corresponding elements of the s-th column of A, and add the products. In terms of scalar product [see 1-7]:
\[ g^k_s = \beta^k \alpha_s, \]
where $\beta^k$ denotes the k-th row-vector of B.

In general AB and BA are different matrices, i.e. the commutative law does not hold for the multiplication of matrices.

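Let $C = ((c^i_k))$ be an arbitrary matrix.
\begin{align*}
CB = H = ((h^i_j)), \quad &\text{where } h^i_j = \sum_k c^i_k\, b^k_j, \\
HA = P = ((p^i_s)), \quad &\text{where } p^i_s = \sum_j h^i_j\, a^j_s = \sum_{j,k} c^i_k\, b^k_j\, a^j_s, \\
CG = Q = ((q^i_s)), \quad &\text{where } q^i_s = \sum_k c^i_k\, g^k_s = \sum_{j,k} c^i_k\, b^k_j\, a^j_s.
\end{align*}
Hence P = Q, i.e.
\[ (CB)A = C(BA). \tag{3} \]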

This formula can be expressed as a theorem 


Theorem For the multiplication of matrices the associative law holds 
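The composition of two transformations, the product G = BA, the failure of the commutative law and the associative law can all be seen on 2-rowed matrices. The following Python sketch is only an illustration: the names `matmul` and `apply` and all the sample matrices and vectors are assumptions for this example.

```python
def matmul(X, Y):
    """Product X Y: rows of X multiplied with columns of Y."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))]
            for i in range(len(X))]

def apply(A, x):
    """Image of the vector x under the linear transformation with matrix A."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

A = [[1, 2],
     [0, 1]]
B = [[0, 1],
     [1, 0]]
C = [[1, 1],
     [2, 0]]
xi = [3, 4]

eta = apply(A, xi)            # xi -> eta
zeta = apply(B, eta)          # eta -> zeta
G = matmul(B, A)              # matrix of the composed transformation
zeta_direct = apply(G, xi)    # xi -> zeta in one step

AB = matmul(A, B)             # differs from BA for this sample
left = matmul(matmul(C, B), A)
right = matmul(C, matmul(B, A))
```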

1-(11)11 n-vectors considered as matrices. It is often useful to consider an n-vector
\[ \xi = (x_1, \dots, x_n) \]
as a matrix, e.g. as a matrix
\[ (x) = \begin{pmatrix} x_1 & 0 & \dots & 0 \\ x_2 & 0 & \dots & 0 \\ \vdots & \vdots & & \vdots \\ x_n & 0 & \dots & 0 \end{pmatrix}, \tag{1} \]
where the first column is formed by the coordinates of $\xi$, whereas the other elements are equal to zero. If we multiply such a matrix from the left hand side with an arbitrary matrix with n rows and columns, the result is again a matrix of the type (1). The linear transformation 1-(11), (2) can therefore be expressed as an equation for matrices
\[ A(x) = (y). \]
Similarly 1-(11)1, (1) is equivalent to
\[ B(y) = (z). \]
By putting in, one gets the equation
\[ BA(x) = (z), \]
which expresses 1-(11)1, (2) in a matrix form.


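1-(11)12 Special matrices. A diagonal-matrix is a square shaped matrix, the elements of which are equal to zero, except those in the diagonal:
\[ D = ((d^i_k)), \qquad d^i_k = 0 \ \text{if } i \ne k, \qquad d^i_i = d_i. \]
Hence
\[ DA = \begin{pmatrix} d_1 a^1_1 & d_1 a^1_2 & \dots & d_1 a^1_n \\ d_2 a^2_1 & d_2 a^2_2 & \dots & d_2 a^2_n \\ \vdots & & & \vdots \\ d_n a^n_1 & d_n a^n_2 & \dots & d_n a^n_n \end{pmatrix}, \tag{1} \]
\[ AD = \begin{pmatrix} d_1 a^1_1 & d_2 a^1_2 & \dots & d_n a^1_n \\ d_1 a^2_1 & d_2 a^2_2 & \dots & d_n a^2_n \\ \vdots & & & \vdots \\ d_1 a^n_1 & d_2 a^n_2 & \dots & d_n a^n_n \end{pmatrix}. \tag{2} \]

An elementary matrix is a square shaped matrix
\[ E_{rs}(\lambda) = ((e^i_k)), \quad r \ne s, \]
for which
\[ e^i_i = 1 \ (i = 1, \dots, n), \qquad e^r_s = \lambda, \qquad e^i_k = 0 \ \text{for } i \ne k \text{ and } (i, k) \ne (r, s). \]

To multiply A from the left with $E_{rs}(\lambda)$ means a row-addition in A, by which the row $\alpha^r$ of A is replaced by $\alpha^r + \lambda \alpha^s$. To multiply A from the right with $E_{rs}(\lambda)$ means a column-addition in A, by which the column $\alpha_s$ of A is replaced by $\alpha_s + \lambda \alpha_r$.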
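The effect of an elementary matrix as a left and as a right factor can be verified directly. In the following Python sketch, given only as an illustration, the names `matmul` and `elementary` and the sample matrix are assumptions, and rows and columns are counted from 0.

```python
def matmul(X, Y):
    """Product X Y: rows of X multiplied with columns of Y."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))]
            for i in range(len(X))]

def elementary(n, r, s, lam):
    """E_rs(lam): ones in the diagonal, lam in row r and column s, zeros elsewhere."""
    assert r != s
    E = [[1 if i == k else 0 for k in range(n)] for i in range(n)]
    E[r][s] = lam
    return E

A = [[1, 2, 3],
     [4, 5, 6],
     [7, 8, 9]]                 # hypothetical sample matrix
E = elementary(3, 0, 2, 10)     # r = 0, s = 2, lam = 10

EA = matmul(E, A)   # row 0 is replaced by row 0 + 10 * row 2
AE = matmul(A, E)   # column 2 is replaced by column 2 + 10 * column 0
```

For this sample, `EA` has the first row (71, 82, 93), and `AE` has the last column (13, 46, 79); all other rows and columns are unchanged.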


1-(11)2 Decomposition of Matrices. By the method of sweep out, it
has been shown that every matrix can be transformed into a matrix of a
special type by row-addition, row-omission and row-multiplication [see 1-4,
theorem 1]. In a similar way, it will now be shown that a square-shaped
matrix can be transformed into a diagonal-matrix by row-addition and
column-addition. In terms of matrix-multiplication this proposition can be
stated as follows.

Theorem. Every square-shaped matrix A can be represented as a product

$$A = P_1 D P_2, \qquad (1)$$

where $P_1$ and $P_2$ are products of elementary matrices, and D is a
diagonal-matrix.

Proof. If the matrix is the zero-matrix, then it is already a
diagonal-matrix; otherwise one can arrange by column-addition (if necessary)
that at least one element in the first column is different from zero, and by
row-addition that $a^1_1 \neq 0$. Thus one can sweep out the first column by
row-addition, and subsequently one can sweep out the first row by
column-addition without altering the first column. If the matrix is not already
a diagonal-matrix, one can now arrange by row- and column-additions,
without altering the first row and the first column, that $a^2_2 \neq 0$; then the
second column and the second row are swept out. This procedure can be
continued until the matrix has been made a diagonal-matrix D. Every
row-addition means a multiplication with an elementary matrix from the left,
and similarly a column-addition corresponds to an elementary matrix as
a right-hand side factor. Since the addition performed by $E_{rs}(\lambda)$ is
undone by $E_{rs}(-\lambda)$, these factors can be removed again from D by further
elementary factors, and formula (1) holds.

A representation of A as a product of diagonal and elementary
matrices is also called a decomposition of A into these factors.
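
The sweep-out procedure of the proof can be followed step by step on a small example. The code below is a sketch under stated assumptions, not the author's own formulation: it uses exact fractions, counts indices from 0, and the names `decompose`, `row_add` and `col_add` are hypothetical. Each row-addition applied to the working matrix is compensated by the opposite column-addition on $P_1$ (and each column-addition by the opposite row-addition on $P_2$), so that $A = P_1 D P_2$ holds throughout.

```python
from fractions import Fraction

def identity(n):
    return [[Fraction(int(i == k)) for k in range(n)] for i in range(n)]

def matmul(X, Y):
    return [[sum(X[i][j] * Y[j][k] for j in range(len(Y)))
             for k in range(len(Y[0]))] for i in range(len(X))]

def row_add(M, r, s, lam):
    """Left factor E_rs(lam): row r becomes row r + lam * row s."""
    M[r] = [x + lam * y for x, y in zip(M[r], M[s])]

def col_add(M, r, s, lam):
    """Right factor E_rs(lam): column s becomes column s + lam * column r."""
    for row in M:
        row[s] += lam * row[r]

def decompose(A):
    """Return (P1, D, P2) with A = P1 * D * P2 and D diagonal."""
    n = len(A)
    D = [[Fraction(x) for x in row] for row in A]
    P1, P2 = identity(n), identity(n)

    def row_op(r, s, lam):
        row_add(D, r, s, lam)
        col_add(P1, r, s, -lam)      # P1 absorbs the opposite factor E_rs(-lam)

    def col_op(r, s, lam):
        col_add(D, r, s, lam)
        row_add(P2, r, s, -lam)      # P2 absorbs the opposite factor E_rs(-lam)

    for t in range(n):
        if all(D[i][t] == 0 for i in range(t, n)):
            # bring a non-zero element into column t by a column-addition
            k = next((k for k in range(t + 1, n)
                      if any(D[i][k] != 0 for i in range(t, n))), None)
            if k is None:
                break                # the remaining block is zero: D is diagonal
            col_op(k, t, Fraction(1))
        if D[t][t] == 0:
            i = next(i for i in range(t + 1, n) if D[i][t] != 0)
            row_op(t, i, Fraction(1))
        for i in range(t + 1, n):    # sweep out column t by row-additions
            if D[i][t] != 0:
                row_op(i, t, -D[i][t] / D[t][t])
        for k in range(t + 1, n):    # sweep out row t by column-additions
            if D[t][k] != 0:
                col_op(t, k, -D[t][k] / D[t][t])
    return P1, D, P2

A = [[0, 2, 1],
     [1, 1, 0],
     [3, 0, 1]]
P1, D, P2 = decompose(A)
```

For this A the diagonal of D multiplies to -5, the determinant of A, in agreement with the fact that row- and column-additions leave the determinant unchanged.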

1-(11)3 The determinant of a matrix product. Let D be a diagonal-matrix,
then

$$\det D = d_1 \cdots d_n \det(e^1, \ldots, e^n) = d_1 \cdots d_n. \qquad (1)$$

As a multiplication with an elementary matrix from the left (right) hand
side means a row-(column-)addition only, the multiplication with elementary
matrices, or with products of them, does not alter the determinant,
whereas by the multiplication of any matrix by D the determinant is
multiplied by det D, as is seen from 1-(11)12, (1) and (2).

Hence, if $A = P_1 D P_2$, then $\det A = d_1 \cdots d_n$.

Consider AB:

$$\det AB = \det D P_2 B, \qquad \det P_2 B = \det B,$$

$$\det AB = \det D(P_2 B) = d_1 \cdots d_n \det P_2 B = \det A \det P_2 B = \det A \det B.$$

Hence

$$\det AB = \det A \det B. \qquad (2)$$
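
Formula (2) can be verified on a numerical example. The sketch below is an illustration only; it computes determinants by the sum over all permutations, which is a different route from the decomposition used in the proof, and the function names are assumptions of this example.

```python
from fractions import Fraction
from itertools import permutations

def det(M):
    """Determinant as the signed sum over permutations (suitable for small n only)."""
    n = len(M)
    total = Fraction(0)
    for perm in permutations(range(n)):
        inversions = sum(1 for i in range(n) for j in range(i + 1, n)
                         if perm[i] > perm[j])
        term = Fraction(-1 if inversions % 2 else 1)
        for i in range(n):
            term *= M[i][perm[i]]
        total += term
    return total

def matmul(X, Y):
    return [[sum(X[i][j] * Y[j][k] for j in range(len(Y)))
             for k in range(len(Y[0]))] for i in range(len(X))]

A = [[2, 1, 0],
     [1, 3, 4],
     [0, 5, 6]]
B = [[1, 0, 2],
     [3, 1, 0],
     [0, 2, 1]]

det_A, det_B, det_AB = det(A), det(B), det(matmul(A, B))
```

Here det A = -10 and det B = 13, and the determinant of the product is -130.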

1-(11)4 The inverse of a linear transformation. By

        A(x) = (y)                                  (1)

the n-vectors (x) are transformed into the vectors (y). If det A = 0, then
the rank of the vectorspace generated by the n-vectors (y) is less than n,
and therefore the (x) are not generated by a linear transformation of the
n-vectors (y). Let det A ≠ 0. Then $\det A \cdot x_k = \sum_i A^i_k y_i$ holds. Hence

        A'(y) = (x),

where $A' = ((b^i_k))$ and $b^i_k = A^k_i / \det A$. Substituting in (1),

        A A'(y) = (y).                              (2)

Hence AA' is a transformation which transforms every n-vector into itself.
Let C be any matrix. $C(e^i)$ is an n-vector which is equal to the i-th column
of C. If therefore $C(e^i) = (e^i)$ for every i, then C is the diagonal matrix

$$E = \begin{pmatrix} 1 & & \\ & \ddots & \\ & & 1 \end{pmatrix}. \qquad (3)$$

Hence AA' = E.

Furthermore A'A(x) = (x).

Hence A'A = E.

The matrix A' is said to be the inverse matrix of A and is mostly denoted
by $A^{-1}$. The necessary and sufficient condition for the existence of an
inverse matrix is det A ≠ 0.

E is called the unit-matrix. For every matrix A,

        AE = A = EA                                 (4)

holds, and

$$E = E^{-1}.$$
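
The construction of the inverse from determinants can be tried out as follows. This is a minimal sketch, assuming that $A^k_i$ above denotes the cofactor (signed minor) belonging to the element in row k and column i; the names `minor`, `det` and `inverse` are illustrative, and indices are counted from 0.

```python
from fractions import Fraction

def minor(M, i, k):
    """M with row i and column k removed."""
    return [row[:k] + row[k + 1:] for r, row in enumerate(M) if r != i]

def det(M):
    """Determinant by expansion along the first row; the empty matrix has determinant 1."""
    if not M:
        return Fraction(1)
    return sum((-1) ** k * M[0][k] * det(minor(M, 0, k)) for k in range(len(M)))

def inverse(M):
    """Inverse built from cofactors; exists only if det M != 0."""
    d = det(M)
    if d == 0:
        raise ValueError("det A = 0: no inverse matrix exists")
    n = len(M)
    # element (i, k) of the inverse is the cofactor of element (k, i), divided by det M
    return [[(-1) ** (i + k) * det(minor(M, k, i)) / d for k in range(n)]
            for i in range(n)]

def matmul(X, Y):
    return [[sum(X[i][j] * Y[j][k] for j in range(len(Y)))
             for k in range(len(Y[0]))] for i in range(len(X))]

A = [[2, 1, 0],
     [1, 3, 4],
     [0, 5, 6]]
A_inv = inverse(A)
E = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
```

Both products `matmul(A, A_inv)` and `matmul(A_inv, A)` give the unit-matrix E, while a matrix with determinant 0 is rejected.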



CHAPTER II 


FUNDAMENTALS OF GENERAL ALGEBRA 

2-1 Principal Notions 

2-11. Fundamental laws. Let

        a, b, c, ...                                (1)

be arbitrary numbers; then there exists, for every pair of them, a uniquely
determined number s, the sum

        a + b = s,                                  (2)

and a uniquely determined number p, the product

        a b = p.                                    (3)


The operations of forming sums and products are called addition and
multiplication.

These operations satisfy the following laws.


Commutative laws:

        a + b = b + a                               (4a)
        a b = b a                                   (4m)

Associative laws:

        (a + b) + c = a + (b + c)                   (5a)
        (a b) c = a (b c)                           (5m)

Laws of inverse existence:

    For every pair a, b there exists an x such that

        x + a = b.                                  (6a)

    For every pair a, b satisfying the condition

        a ≠ 0                                       (6o)

    there exists a y such that

        y a = b.                                    (6m)

Distributive laws:

        (a + b) c = a c + b c                       (7)
        a (b + c) = a b + a c.


Though these formulas are fundamental for calculation with numbers,
they do not characterise completely the notion of number. Of course, in the
different branches of mathematics, the word “number” is used in different
senses. One speaks e.g. of natural numbers, rational numbers, complex
numbers, hypercomplex numbers, etc. The rational numbers, the
real numbers and the complex numbers form three systems each of which
satisfies the above conditions, whereas the system formed by the natural
numbers does not satisfy the laws of inverse existence, as the reader may
verify. On the other hand, in the system formed by all the analytic functions of
a complex variable, all the laws mentioned before hold, though the system
is not composed of numbers. In this chapter, systems of any kind will be
considered where either all these laws, or some of them, hold. In general,
nothing will be supposed about the nature of the mathematical elements
which form these systems. The essential thing is that the elements are
interconnected by the help of certain operations which obey particular
laws. These operations will sometimes be called rational operations.

2-12 Modules. A system of elements for which an operation satisfying
the conditions (4a), (5a), (6a) of 2-11 is defined, is said to form
an abelian group or a module.* The system formed by the rational numbers
is an instance of a module; similarly the system of the real (the complex)
numbers. The vectorspaces (see 1-3) are modules of a different type; the
elements of them are n-vectors, and the addition but not the multiplication
of n-vectors has been defined. Consider especially n = 3. The 3-vectors
can be represented by vectors in the space; thus one gets in this way
modules which are formed by geometrical entities (vectors). Moreover, to
every vector there corresponds a parallel displacement of the space
transforming the starting point of the vector into its endpoint, and conversely
to every parallel displacement of the space there corresponds one and only
one vector. To the sum of two vectors α + β there corresponds the parallel
displacement which is generated by performing successively the two parallel
displacements corresponding to α and to β. Thus the parallel displacements
of the space form a module; this module therefore is composed of elements
which are geometrical transformations.

The integral numbers form a module in which besides addition and
subtraction also multiplication can be performed, whereas division is
possible in special cases only.

* There is no essential difference between the meaning of the two words.
The term “addition” and the sign + are only notations, and there is no harm in
replacing them by other words. In these cases it is unusual to speak of “modules”,
and the word module is replaced by “abelian group”.

The rotations about any particular axis form a module which is connected
with the module of the real numbers as follows. Let g be a particular
positive integral number, and let to every real number α correspond the
rotation through the angle 2πα/g; then there corresponds a rotation to every
real number, and to two real numbers α and β there corresponds the same
rotation if and only if α - β is an integral multiple of g. Two rotations
corresponding to the numbers κ and λ generate, if taken one after the other,
a rotation which corresponds to κ + λ. The rotations for which α is
an integral number form a finite system which is a module. Each of these
rotations corresponds to one of the integral numbers 0, 1, 2, ..., g - 1,
which occur as residues when an integral number is divided by g. Two
integral numbers corresponding to the same rotation (and therefore to the
same residue) are said to be congruent,

        a ≡ b (mod g);

they form a class of residues modulo g. It will be proved later on that these
classes form a module; this module is of special interest, especially in the
case when g is a prime number.

2-13 Partition into classes. In the example considered just before, a
partition of the set of all integers into classes has been generated by a
congruence of its elements. This consideration will now be generalised.

An arbitrary set A of elements a, b, c, ... may be decomposed into
classes, so that every element belongs to one and only one class. Two
elements are said to be equivalent, written a ~ b, if they belong to the same
class. Then the equivalence has the following properties:

        a ~ a                              law of reflexivity
        If a ~ b, then b ~ a               law of symmetry          (1)
        If a ~ b, b ~ c, then a ~ c        law of transitivity

Hence to every partition of a set A into classes there corresponds an
equivalence of its elements, such that the three laws (1) are satisfied for this
equivalence. The usual way of mathematical investigation, however, is
the converse one. An equivalence between the elements of A is given, and
from this equivalence a partition into classes is derived. It has been shown
just before that an equivalence which generates a partition into classes
must satisfy the conditions (1); by the following lemma it will be established
that these conditions are not only necessary, but also sufficient for a
partition into classes.

Lemma. Given an equivalence between the elements of A satisfying
the conditions (1), let (a), (b), (c), ... be the classes of the elements
equivalent respectively to a, b, c, ...; then each element of A belongs to
a class, and two classes have either all elements in common (i.e., they are
identical) or they have no common element.

Proof. As a ~ a, the element a itself belongs to the class (a), formed
by the elements equivalent to a. If b is a common element of (a) and (c),
then it follows from the law of symmetry that a and c belong to (b).
From the law of transitivity it follows that each element of (a) and of (c)
belongs to (b), and that each element of (b) belongs to (a) and to (c).
Therefore (a), (b) and (c) are identical.
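
The passage from an equivalence to a partition can be made concrete. The following sketch is an illustration, not part of the original text; the function name `partition` is hypothetical, and it is assumed that the relation passed in really satisfies the three laws (1). Because of transitivity it suffices to compare a new element with one representative of each class already formed.

```python
def partition(elements, equivalent):
    """Split `elements` into classes of mutually equivalent elements."""
    classes = []
    for x in elements:
        for cls in classes:
            if equivalent(x, cls[0]):   # compare with one representative only
                cls.append(x)
                break
        else:
            classes.append([x])         # x starts a new class
    return classes

g = 4
classes = partition(range(-6, 7), lambda a, b: (a - b) % g == 0)
```

With the congruence modulo 4 as equivalence, the integers from -6 to 6 fall into four classes, each element belonging to exactly one of them.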

Each element of a class will be called its representative, and we will
use the notation

        (a) = the class represented by a.           (2)

By the method of forming classes, new mathematical entities are often
created. From the operations on the original elements, operations on the
classes are derived in the following manner.

Let an operation, say addition, exist for the elements of any system A,
and let a partition of A into classes be given. The sum of two classes is
mostly defined by

        (c) + (d) = (c + d).                        (3)

But this definition is admissible if and only if the class (c + d) is the same
whatever elements c and d are chosen as the representatives of their classes.
Similarly in the case when the operation is multiplication.

Now the congruence (mod g) defined in 2-12 is an equivalence which
obviously satisfies the conditions (1). This congruence generates a partition
into g classes of residues

        (0), (1), ..., (g - 1).                     (4)

The lemma proved just before will be applied to these classes. As every
integer is congruent to its residue after division by g, (a) = (a') if and only
if a ≡ a' (mod g); furthermore, if (a) = (a'), (b) = (b'), then a' = a + r g,
b' = b + s g, hence (a + b) and (a' + b') = (a + b + [r + s] g) denote the
same class. It is therefore admissible to define addition of classes by

        (a) + (b) = (a + b);

similarly, since a' b' = a b + [a s + r b + r s g] g,

        (a) (b) = (a b).                            (5)

From (5) it follows directly:

        (a) + (b) = (b) + (a),   [(a) + (b)] + (c) = (a) + [(b) + (c)],

        (a) + (b - a) = (b),

        (a) (b) = (b) (a),   [(a) (b)] (c) = (a) [(b) (c)],

        [(a) + (b)] (c) = (a) (c) + (b) (c),   (a) [(b) + (c)] = (a) (b) + (a) (c).

Hence the classes of residues (mod g) form a module in which a second
operation, the “multiplication”, is defined satisfying the conditions (4m),
(5m) and (7) of 2-11.
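
That the definitions (5) do not depend on the chosen representatives can be tested by brute force for a particular g. The sketch below is an illustration only; the helper `cls` and the contrasting operation `max` are choices made for this example. The maximum of two integers is included to show an operation for which the analogous definition is not admissible.

```python
g = 6

def cls(a):
    """Residue 0..g-1 serving as a fixed label of the class (a)."""
    return a % g

shifts = range(-2, 3)

# Replace a by a' = a + r*g and b by b' = b + s*g and compare the resulting classes.
sum_and_product_admissible = all(
    cls((a + r * g) + (b + s * g)) == cls(a + b)
    and cls((a + r * g) * (b + s * g)) == cls(a * b)
    for a in range(g) for b in range(g) for r in shifts for s in shifts
)

# The class of max(a, b) changes with the representatives, e.g. max(1, 2) and max(7, 2).
max_admissible = all(
    cls(max(a + r * g, b + s * g)) == cls(max(a, b))
    for a in range(g) for b in range(g) for r in shifts for s in shifts
)
```

The first check succeeds and the second fails, which illustrates why the independence of the representatives has to be verified before an operation on classes is defined.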

2-14 Singular elements. Although a module is not necessarily a set
of numbers, there exists in every module an element which has about the
same properties as the number zero, and which is therefore usually denoted
by the character 0.

Theorem. In a module M there exists one and only one singular
element 0, such that for every element a of M, a + x equals (does not
equal) a, if x is the singular (a non-singular) element.

Proof. Let a be any particular element of M. As the law of inverse
existence (6a) of 2-11 holds for the addition as defined in M, there must
exist an element, say 0, of M satisfying the condition

        a + 0 = a.                                  (1)

Thus the sum of a and 0 is a itself. It will be proved now that this
element 0 has the same property with respect to every element of M, and
that it is unique. Let b be an arbitrary element of M; then there exists
an element c of M satisfying

        c + a = b.

Hence

        b + 0 = (c + a) + 0 = c + (a + 0) = c + a = b.

Hence 0, when added to any element b of M, gives b.

To prove its uniqueness, one may suppose that there exists another
element, say 0', in M, so that a + 0' = a; then 0' has the same properties as
0, and the elements of M will not change when 0' is added. Hence
0 + 0' = 0, but on the other hand 0' + 0 = 0' holds. Hence 0 = 0'.

There exists in M an element a', for which

        a + a' = 0                                  (2)

holds. The uniqueness of a' is a consequence of the following consideration.





Let c + a = c' + a = b, then

        c' = c' + 0 = c' + (a + a') = b + a' = (c + a) + a' = c + 0 = c.

Hence

Theorem. The equation x + a = b has one and only one solution.

2-15 Operations in a module. The addition of a' is the operation
inverse to the addition of a. The addition of a' will therefore be called
the subtraction of a, and the following notations will be used:

        a' = -a                                     (1)

        d + a' = d - a                              (2)

Since -a' = a, -(-a) = a holds.

Now $(a_1 + a_2) + a_3 = a_1 + (a_2 + a_3)$; hence one can omit the brackets
in this sum. By mathematical induction one can prove, in the same manner
as is done in elementary arithmetic, that the brackets in the sums of n
elements can be omitted. In place of $a_1 + \cdots + a_n$ one writes
sometimes

$$\sum_{i=1}^{n} a_i;$$

for abbreviation this notation is often replaced by $\sum_i a_i$ or by $\sum a_i$,
when there is no ambiguity.

If $a_1 = a_2 = \cdots = a_n = a$, the sum will be denoted by

        n a;                                        (3)

thus n is a positive integer, and not necessarily an element of M.

The product of a non-positive integer with an arbitrary element a
of M is defined by

        (-n) a = -(n a) = n (-a),

        0 a = 0.

Hence for all integers p, q the following distributive laws hold:

        p a + q a = (p + q) a,
                                                    (3')
        p (a + b) = p a + p b.

A system S' which is formed by elements of any set S is said to be a
subset of S. In particular, a subset M' of a module (an abelian group) M
is said to be a submodule (a subgroup) of M if it forms a module with respect
to the addition defined in M.

Theorem. A subset M' is a submodule of M if and only if the
differences of the elements of M' belong to M'.

Proof. The condition is obviously necessary. Let the condition
hold, and let a and b be elements of M'; then b - b = 0, 0 - b = -b, and
a - (-b) = a + b belong to M'. Hence the addition defined in M can be
carried out in M'; for this addition the associative and the commutative
laws hold, and the equation x + a = b can be solved by the element b - a
of M'. Hence M' is a submodule of M.
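
The criterion of the theorem is easy to apply mechanically. The sketch below takes as M the residues modulo 12 under addition, an example chosen for this illustration; the function name `is_submodule` is hypothetical, and the requirement that the subset be non-empty is made explicit here although the theorem leaves it tacit.

```python
g = 12   # M consists of the residues 0, 1, ..., 11 with addition modulo 12

def is_submodule(subset):
    """Non-empty and closed under differences (taken modulo g)."""
    return bool(subset) and all((a - b) % g in subset
                                for a in subset for b in subset)

def closed_under_sums(subset):
    return all((a + b) % g in subset for a in subset for b in subset)

multiples_of_3 = {0, 3, 6, 9}
not_closed = {0, 1, 2}      # 0 - 1 is the residue 11, which is missing
```

For `multiples_of_3` the differences stay inside the subset, and so do the sums, as the proof predicts; `not_closed` fails the criterion.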

2-16 Rings. If in a module T a multiplication satisfying the associative
and the distributive laws is defined in such a manner that the product
of every pair of elements belongs to T, this module is called a ring, and
if in a ring the multiplication satisfies the commutative law (4m) of 2-11,
the ring is said to be a commutative ring.

The integral numbers form a commutative ring R; other instances
of commutative rings are the sets of the integral multiples of an integral
number g, and the classes of residues (mod g). Non-commutative rings
are e.g. formed by matrices (see ch. VI).

A subset of a ring T, which itself is a ring with the addition and the
multiplication as defined in T as its operations, is said to be a subring of T.

Exercises. (1) A subset T' of a ring T is a subring of T if and only
if the differences and the products of the elements of T' belong to T'.

(2) Replace the law of reflexivity by the following condition: “To
every element of A there exists at least one element which is equivalent to
it”, and show that this condition in connection with the laws of symmetry
and transitivity implies the law of reflexivity.

As the distributive law holds in a ring,

        0 = c c - c c = c (c - c) = c 0
          = (c - c) c = 0 c.

Therefore it is impossible to satisfy the condition (6m) of 2-11 in any ring
without restriction. If we want to introduce the condition that to every
pair of elements a, b there should exist such a y that

        y a = b,

we are compelled to make the restriction

        a ≠ 0.

We get this restriction from formula (6o) of 2-11 by replacing the number
0 by the singular element 0 of the module T; when mentioned in the
following, the formula (6o) of 2-11 should always be understood in that way.

2-2 Fields. If in a commutative ring which contains more than
one element the condition (6m) with the restriction (6o) of 2-11 holds, it
is said to be a field. In other words, a field is a set of more than one element
in which an addition and a multiplication are defined, and the conditions
(4a), (4m), (5a), (5m), (6a), (6m) with (6o) and (7) of 2-11 hold for these
operations.

Theorem. If a, b are elements of a field F and a b = 0, then at least
one of the factors a, b must be equal to 0.

Proof. Let a ≠ 0, b ≠ 0, and suppose a b = 0. From (6m) and (6o) of 2-11
it follows that there is in F an element c such that c b ≠ 0 (e.g. c b = a) and
an element y such that y a = c. Then 0 = y 0 = y a b = c b ≠ 0, which is a
contradiction. Hence the theorem.

From this theorem it follows that the elements ≠ 0 of a field F form
a system Σ where the multiplication is commutative and associative, and in
which for every pair a, b of elements of Σ the equation a x = b has a
solution in Σ. Hence the multiplication satisfies in Σ the same conditions
as the addition must satisfy in a module, only the sign + has been
replaced by the notation of the multiplication. So one can “translate” the
results of 2-15 from the “additive language” into the “multiplicative
language”.

2-21 Nullelement and Unitelement. From 2-14 there follows the
existence of a unique singular element 1 satisfying the condition

        a 1 = 1 a = a                               (1)

for every element a of Σ. But, as (1) holds also for a = 0, it follows:

Theorem 1. There is one and only one element 1 in a field satisfying
(1) for every element a of the field.

To every element a of Σ there is an element $a^{-1}$ satisfying

$$a\,a^{-1} = a^{-1} a = 1. \qquad (2)$$

By translating the theorem of 2-15 into the “multiplicative language” one
gets:

Theorem 2. If a and b are elements of a field F and a ≠ 0, then the
equation a x = b has one and only one solution in F, namely $x = a^{-1} b$.

Corresponding to the sum of n elements one can form the product
$\prod_{i=1}^{n} a_i$ of n elements, and if the elements are all equal, the product is the
power

$$a^n. \qquad (3)$$

The powers with non-positive integral exponents are defined by

$$a^0 = 1, \qquad a^{-n} = (a^n)^{-1} = (a^{-1})^n.$$

Hence for every pair of integers p, q the equation

$$a^p a^q = a^{p+q} \qquad (4)$$

holds.


The necessary and sufficient conditions which a field must satisfy
can also be given in the following manner.

A system F of more than one element, for which the addition and the
multiplication are uniquely defined, is a field if and only if:

1. The elements form a module; the singular element may be denoted
by 0.

2. The elements different from 0 form a system Σ of at least one element
which is an abelian group with respect to the multiplication.

3. a 0 = 0 a = 0.

4. a (b + c) = a b + a c, for arbitrary elements a, b, c of F.

The singular element 0 of the module will be called the nullelement or
zero-element.

The singular element 1 of Σ will be called the unitelement.

        1 ≠ 0.                                      (5)

2-22. Homomorphism, Isomorphism and Automorphism. Let the
elements a, b, c, ... of an arbitrary ring T be represented by the elements

        α(a), α(b), α(c), ...

of a set A, and let an addition and a multiplication exist in A for which
the formulas

        α(a) + α(b) = α(a + b),
                                                    (1)
        α(a) α(b) = α(a b)

are satisfied for all elements a, b of T; then the representation is said to
be a homomorphism and A is said to be homomorphic to T. The element
α(a) of A is said to be the image of the original element a of T. The
representation of a by α(a) is denoted sometimes by

        a → α(a);

one says that a particular homomorphism maps T on A. There are
homomorphisms where different elements are mapped on the same image.

Theorem 1. A is a ring. If T is commutative, A is also a commutative
ring.

Proof. Every element of A is of the form α(a), where a is a suitable
element of T, not necessarily defined uniquely by its image α(a).

        [α(a) + α(b)] + α(c) = α(a + b + c) = α(a) + [α(b) + α(c)],

        α(a) + α(b) = α(a + b) = α(b) + α(a),       (2)

        α(a) + α(b - a) = α(b).

Hence A is a module.

The nullelement of A is α(0), since

        α(a) + α(0) = α(a + 0) = α(a)

holds. After replacing the notion of addition in (2) by that of
multiplication, one realises that the multiplication in A is always associative,
and that it is commutative if T is a commutative ring. To verify the
distributive laws consider

        α(a) [α(b) + α(c)] = α(a) α(b + c) = α(a b + a c) = α(a b) + α(a c)

                           = α(a) α(b) + α(a) α(c).

From this formula and the commutative law, the second distributive law
follows.

Hence the theorem.

Exercise. Prove $\alpha(a^n) = \alpha(a)^n$ for any positive integer n, and
α(m a) = m α(a) for any integral number m.

Now α(b) + α(-b) = α(0); hence

        α(-b) = -α(b), and therefore
        α(a - b) = α(a) + α(-b) = α(a) - α(b). Hence

Theorem 2. The necessary and sufficient condition for α(a) = α(b)
is α(a - b) = α(0).

The number of the original elements which are mapped on α(0) is important,
as is seen from the following corollary.

Corollary. Either every element of A is an image of more than one
element of a ring T, or every element of A is an image of one and only one
element of T.

In the second case A is said to be isomorphic to T, and the homomorphism
is called an isomorphism. An isomorphism is therefore a (1,1)-correspondence
of two rings by which the sums, differences and products
of corresponding pairs of elements correspond. In some cases therefore
isomorphic rings are considered to be equal or to differ by the notation
only of the elements, but in general the notion of isomorphism must be
distinguished from that of identity. A ring T may have, e.g., two different
subrings which are isomorphic. Of special interest is the case when two
sets are isomorphic and form the same set; this isomorphism is called an
automorphism. Hence an automorphism is a permutation of elements for which
addition and multiplication are invariant. The isomorphism mapping A
on B and conversely is sometimes denoted by A ↔ B.

Examples. 1. The ring of the classes of residues (mod g) is homomorphic
to the ring of the integers.

2. In the ring of the numbers a + b√2 (a, b integers), the transformation
a + b√2 → a - b√2 is an automorphism.
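
Both examples can be checked on a finite sample of elements. The sketch below is an illustration under assumptions of its own: the number a + b√2 is stored as the pair (a, b), the residue class of an integer is labelled by its residue, and the names `alpha`, `add`, `mul` and `conj` are chosen for this example. A finite sample does not prove the statements; it merely exhibits the formulas (1) at work.

```python
from itertools import product

g = 6

def alpha(a):
    """Image of the integer a: the residue labelling its class modulo g."""
    return a % g

sample = range(-10, 11)
is_homomorphism = all(
    (alpha(a) + alpha(b)) % g == alpha(a + b)
    and (alpha(a) * alpha(b)) % g == alpha(a * b)
    for a in sample for b in sample
)

# Numbers a + b*sqrt(2) with integers a, b, stored as pairs (a, b).
def add(x, y):
    return (x[0] + y[0], x[1] + y[1])

def mul(x, y):
    return (x[0] * y[0] + 2 * x[1] * y[1], x[0] * y[1] + x[1] * y[0])

def conj(x):
    """The transformation a + b*sqrt(2) -> a - b*sqrt(2)."""
    return (x[0], -x[1])

pairs = list(product(range(-3, 4), repeat=2))
is_automorphism = all(
    conj(add(x, y)) == add(conj(x), conj(y))
    and conj(mul(x, y)) == mul(conj(x), conj(y))
    and conj(conj(x)) == x
    for x in pairs for y in pairs
)
```

In the first example different integers such as 1 and 7 have the same image, so the homomorphism is not an isomorphism; in the second the transformation is its own inverse and therefore a (1,1)-correspondence.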

2-23 The ring of classes of residues generated by a homomorphism.
If α is a homomorphism of a ring T, there is a partition of the elements
of T into classes, two elements a and b being equivalent if α(a) = α(b),
i.e., if α(a - b) = α(0). These classes are classes of residues as
considered in 2-13, but they are of a particular kind, as the elements c for which
α(c) = α(0) form a subring T' of T, with the property that each product
of an element of T and an element of T' belongs to T'. The converse of
this statement is contained in the following theorem.

Theorem 1. Let a ring T contain a subring T', and let every product
in T, one factor of which is an element of T', be contained in T'; then the
classes of residues of T' in T form a ring which is homomorphic to T.

Proof. Let r, r', ... be elements of T', and let us consider two
elements of T to be equivalent if and only if their difference belongs to T'.
This definition satisfies the conditions of 2-13, and furnishes therefore a
partition of T into classes. Furthermore

        [a + r] + [b + r'] ~ a + b, and [a + r] [b + r'] ~ a b.

Hence it is admissible to define the addition and the multiplication of the
classes (a), (b), ... by the corresponding operations exercised on their
representatives:

        (a) + (b) = (a + b), (a) (b) = (a b).

Every rational formula holding for elements a, b, ... remains correct if these
elements are replaced in the formula by the classes which they represent.
Hence the classes form a module in which the associative law of
multiplication and the distributive law hold, i.e. they form a ring which is
homomorphic to T.

Consider e.g. the rings which are homomorphic to the ring R of the
integral numbers. To investigate this important class of rings, one has to
find out all the subrings of R having the desired property. Every subring
is also a submodule; in the case of R, it is easy to prove that also the
converse holds.

Lemma. Every submodule of the ring R of the integral numbers is
a ring $R_g$ consisting of the multiples of a non-negative integral number g.

Proof. If ±g is contained in a submodule M of R, then M contains
$R_g$. Now M either consists of 0 only, and is therefore equal to $R_0$, or it
contains a minimum positive integer, say g. Let m be any number out of
M and m = k g + g', where g' is the residue, 0 ≤ g' < g. Then g' is an
element of M, and since g is the smallest positive element of M, g' = 0.
Hence m is contained in $R_g$, and therefore M = $R_g$.
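
The lemma can be observed on the submodule generated by a few integers. The following sketch is an illustration only: the generators 12, 18, 30, the bound on the coefficients and the window of values inspected are arbitrary choices of this example, and `generated_window` is a hypothetical helper that lists only a finite part of the (infinite) submodule.

```python
from itertools import product

def generated_window(generators, coeff_bound, window):
    """Integer combinations sum(c_i * x_i) with |c_i| <= coeff_bound lying in [-window, window]."""
    found = set()
    coeff_range = range(-coeff_bound, coeff_bound + 1)
    for coeffs in product(coeff_range, repeat=len(generators)):
        m = sum(c * x for c, x in zip(coeffs, generators))
        if abs(m) <= window:
            found.add(m)
    return found

M = generated_window([12, 18, 30], coeff_bound=6, window=60)
g = min(m for m in M if m > 0)              # the minimum positive element
residues = {m % g for m in M}               # the residues g' of the proof
```

The smallest positive element is g = 6, every residue g' is 0, and the part of the submodule inside the window consists exactly of the multiples of 6.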

It is obvious that $R_g$ is a commutative ring, and that it has especially
the property, supposed in the preceding theorem, that the product of any
element of R and an element of $R_g$ belongs to $R_g$.

To get the rings which are homomorphic to R, one has to consider the
classes of residues of $R_g$ in R (for g = 0, 1, ...).

Let g = 0; then $R_0$ consists of the number 0 only. Every class of
residues consists of one element only. The ring formed by these classes is
therefore isomorphic to R.

Let g = 1; $R_1$ is identical with R. There is one class of residues only;
all the numbers are equivalent to 0. The ring formed by the classes of
residues consists of the nullelement only.

Let g > 1. The classes of residues form a commutative ring G
consisting of the elements

        (0), ..., (g - 1),

which has already been considered in 2-13. G is homomorphic but obviously
not isomorphic to R. One has to distinguish whether g is prime or not.
This distinction is afforded by the following two theorems.

Theorem 2. If g > 1 is not a prime number, the classes of residues of
g form a commutative ring G which is neither itself a field, nor a subring
of any field.

Proof. That G is a commutative ring follows directly from the
preceding theorem. As g > 1 is not prime, g = r s, where both the factors
are positive and less than g. The classes of residues (r) and (s) are both
different from (0), but their product (r) (s) = (0). Since in a field the
product of two elements which are different from the nullelement is itself
different from the nullelement, G cannot be a field nor a subring of a field.

Theorem 3. The classes of residues

        (0), (1), ..., (p - 1)                      (1)

of a prime number p form a field $GF_p$.

Proof. To prove the theorem, it must be shown only that for every
particular pair of classes (a) ≠ (0) and (b), the equation

        (a) (x) = (b)                               (2)

has a solution. Let (x) run over the p classes (1), and consider the
products (a) (x). If two of them are equal, say $(a)(x_1) = (a)(x_2)$, then
$(a)(x_1 - x_2) = (0)$, and therefore $a[x_1 - x_2]$ is divisible by p. But since
for (a) ≠ (0), a is not divisible by p, the factor $x_1 - x_2$ must be divisible
by p. Hence $(x_1) = (x_2)$. The p products (a) (x) form therefore p different
classes. One of them must be the class (b); the equation (2) has
therefore a solution. Hence the theorem.
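
The contrast between the two theorems shows up at once in small cases. The sketch below is an illustration, with the classes labelled by their residues and the function names `zero_divisors` and `solve` chosen for this example; `solve` follows the proof of Theorem 3 literally by letting x run over all p classes.

```python
def zero_divisors(g):
    """Pairs (r, s), r <= s, of classes different from (0) with (r)(s) = (0) modulo g."""
    return [(r, s) for r in range(1, g) for s in range(r, g) if (r * s) % g == 0]

def solve(a, b, p):
    """All classes x with (a)(x) = (b) modulo p."""
    return [x for x in range(p) if (a * x) % p == b % p]

composite_case = zero_divisors(6)        # 6 = 2 * 3 is not prime
prime_case = zero_divisors(7)            # 7 is prime
products = sorted((3 * x) % 7 for x in range(7))
solution = solve(3, 5, 7)
```

Modulo 6 there are products of non-zero classes equal to (0), so no division by those classes is possible. Modulo 7 there are none, the seven products (3)(x) are all different, and the equation (3)(x) = (5) has exactly one solution.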

By this theorem it has been shown that a field may be homomorphic
to a ring which is not itself a field. One may wonder whether a ring can be
homomorphic to a field; especially it is interesting to know whether a field
can be homomorphic to a field without being isomorphic to it.

Theorem 4. When a ring A which contains more than one element is
homomorphic to a field F, then A is isomorphic to F, and A is therefore
a field.

Proof. Let a, b, c, d denote elements of F. If a and b are represented
by the same element of A, then a - b is represented by the nullelement.
Suppose a ≠ b. Let c be represented by an element α of A which is
different from the nullelement, and put c/(a - b) = d. The product d(a - b)
must be represented by the nullelement, since a - b is represented by it; this
contradicts the supposition that d(a - b) = c is represented by α.
Hence any two different elements of F must be represented by different
elements of A, i.e. A is isomorphic to F; hence it is a field.

A ring can therefore be homomorphic to a field in the two trivial cases
only, where the ring either consists of a nullelement only, or is isomorphic
to the field. As a field is supposed to contain more than one element,
it follows:

Corollary. If a field is homomorphic to another field, it is isomorphic
to it.


2-24 Subfields of a field. Let

$$M_1, M_2, \ldots \qquad (1)$$

be modules, finite or infinite in number. If these modules have common
elements, these form a module, which is called the meet of the modules
(1), since if a and b are common elements, then a - b belongs to $M_1$ as well
as to $M_2$, ..., and is therefore a common element. In the special cases
where the modules are rings, the products a b belong also to the meet, which
is therefore a ring. If the modules are fields, and the meet contains more
than one element, the meet itself is a field. A subring F' of a field F which
is itself a field is said to be a subfield of F; this connection is denoted by
F' ⊆ F, and if F' ≠ F, by F' ⊂ F. If a subset X of F has the property that
the sum, the difference, the product and the quotient of any two elements
of X (the divisor being ≠ 0) belong to X, then X is a subfield of F. Indeed
the commutative, associative and distributive laws hold in X as they hold
in F. The meet of all the subfields of a field F contains at least the elements
0 and 1, and it is therefore a field which is called the primefield of F. The
primefield of F is therefore a subfield of F and it is also a subfield of every
subfield of F; the primefield has no other subfield than itself, and it is the
only subfield with this property.

Theorem. The elements which one gets by repeated addition,
subtraction, multiplication and division of the unitelement of F and of the
elements generated in this manner form the primefield of F.

Proof. Every subfield of F must contain an element different from
the nullelement, say the element a. Furthermore it contains a - a = 0,
a/a = 1, and it contains all those elements which are generated by repeated
addition, subtraction, multiplication and division of 1, and of the elements
generated successively by these operations. Hence these elements form a
set P which is contained in the primefield of F. On the other hand, the
sums, differences, products and the quotients (unless the divisor is the
nullelement) of two elements of P belong to P. Hence P is a subfield of the
primefield, and P is therefore the primefield itself.

From this theorem it follows that the primefield of the field of the
complex numbers is the field of the rational numbers, and that the fields
$GF_p$ are their own primefields.

2-25 Primefields. To investigate the primefield of F, consider the
module generated by the unitelement 1 of F. This module is a submodule
of the primefield, and it consists of the elements

$n\,1$ (1)

where n runs over all the integral numbers. As has been shown in 2-15,

$p\,1 + q\,1 = (p + q)\,1$.

To prove the corresponding multiplicative formula

$[p\,1]\,[q\,1] = pq\,1$ (2)

we use mathematical induction. The formula is obvious for q = 0 and every
integral value of p. From

$[p\,1]\,[(q \pm 1)\,1] = [p\,1]\,[q\,1 \pm 1] = [p\,1]\,[q\,1] \pm p\,1$

it follows that if (2) holds for a particular value of q, then
$[p\,1]\,[(q \pm 1)\,1] = pq\,1 \pm p\,1 = p(q \pm 1)\,1$; thus (2) is correct
also for q + 1 and q - 1. Hence (2) holds for every pair of integral
numbers p, q. The module formed by the elements (1) is therefore a ring
$R^*$, and is homomorphic to the ring of the integral numbers. Hence there
is no harm in introducing the shorter notation

$\mathrm{n}\,1 = n$. (3)

$p + q$ denotes the element of F which corresponds to $\mathrm{p} + \mathrm{q}$, and it is
simultaneously the sum of $p$ and $q$; the same holds for $p - q$ and
$pq$. One should notice that the italic characters

$0, \pm 1, \pm 2, \ldots, \pm n, \ldots$ (4)

denote elements of F, whereas the roman characters

$\mathrm{0}, \pm\mathrm{1}, \pm\mathrm{2}, \ldots$

denote integral numbers, which need not be elements of F. On the other
hand, the elements (4) need not all be different.

Let e.g. F be a field $GF_p$ [see 2-23, th. 2], say $GF_5$; then 2 = 7, etc.
The notation 0 for the nullelement tallies with the notation of 2-1. The
ring of the elements (4) is homomorphic to the ring of the integral numbers,
and it is therefore isomorphic to a ring of classes of residues. The
nullelement corresponds to a subring $R_g$ of the ring of the integral numbers
[see 2-23]. The number g is called the characteristic of the field F.
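
The notion of characteristic can be illustrated by a short computation. The following Python sketch is an illustration only and not part of the original text: it assumes that the classes of residues mod 5 are modelled by the integers 0, ..., 4, and the names `multiple`, `characteristic` and the search bound `limit` are hypothetical choices made for this example.

```python
def multiple(n, add, one, zero):
    """Return n*1, the sum of n terms each equal to the unitelement (n >= 0)."""
    total = zero
    for _ in range(n):
        total = add(total, one)
    return total


def characteristic(add, one, zero, limit=1000):
    """Return the least g > 0 with g*1 = 0, or 0 if no such g <= limit exists.

    The bound `limit` is an assumption of this sketch: a machine search
    cannot prove characteristic 0, it can only fail to find a positive g.
    """
    total = zero
    for g in range(1, limit + 1):
        total = add(total, one)
        if total == zero:
            return g
    return 0


def add_mod5(a, b):
    """Addition of classes of residues mod 5."""
    return (a + b) % 5


def add_int(a, b):
    """Addition of ordinary integral numbers."""
    return a + b


# In GF_5 the elements denoted 2 and 7 coincide.
print(multiple(2, add_mod5, 1, 0) == multiple(7, add_mod5, 1, 0))  # True
print(characteristic(add_mod5, 1, 0))  # 5
print(characteristic(add_int, 1, 0))   # 0 (no positive g found up to the bound)
```

The result 0 for the integral numbers only reflects that the search found no positive g; that the characteristic is really 0 follows from the argument in the text, not from the computation.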

2-26 Fields of characteristic p. Let g > 0. Then $R^*$ is isomorphic
to the ring G, as considered in 2-23. From the 2nd theorem of 2-23 it
follows that g is a prime number p, and $R^*$ is therefore isomorphic to $GF_p$.

Hence $R^*$ is a field, and since $R^*$ is contained in the primefield, which
has no subfield different from itself, $R^*$ is the primefield of F. Hence:

Theorem. If the characteristic of a field F is different from zero, it
is a primenumber p. The primefield consists of the elements

$1, 2, \ldots, p = 0$

and is isomorphic to $GF_p$.

That every primenumber is indeed the characteristic of a suitably chosen
field is obvious from the example of the fields $GF_p$. In a field of
characteristic 2, $a = -a + 2a = -a$ holds; the positive sign and the negative
sign are therefore not different. Calculation in fields of characteristic
2 is much simplified by this fact; on the other hand, there exists no
arithmetic mean of two elements. For this reason the case of fields of
characteristic 2 has sometimes to be considered separately.

If p is a primenumber, the binomial coefficients

$\binom{p}{m} = \frac{p!}{m!\,(p - m)!}$

are divisible by p for 0 < m < p, since the primefactor p does not occur
in the denominator. If one calculates in a field of characteristic p, those
numbers have therefore to be replaced by zero. Hence in a field of
characteristic p

$(a + b)^p = a^p + b^p$ (1)

holds. Replace b by -b; then

$(a - b)^p = a^p - b^p$ (2)

holds for every odd primenumber p; the same formula holds also for p = 2,
since in fields of characteristic 2 subtraction is not different from addition.
Thus (2) is true for every field of characteristic p.
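
Both statements can be checked numerically for small cases. The following Python sketch is an illustration under stated assumptions, not a proof: it tests the divisibility of the binomial coefficients directly, and it tests formulas (1) and (2) only in the fields $GF_p$, modelled by the residues 0, ..., p-1. The function names are hypothetical.

```python
from math import comb


def middle_binomials_divisible(p):
    """Check that p divides the binomial coefficient C(p, m) for 0 < m < p."""
    return all(comb(p, m) % p == 0 for m in range(1, p))


def power_formulas_hold(p):
    """Check (a + b)^p = a^p + b^p and (a - b)^p = a^p - b^p for all residues mod p."""
    for a in range(p):
        for b in range(p):
            if pow(a + b, p, p) != (pow(a, p, p) + pow(b, p, p)) % p:
                return False
            if pow(a - b, p, p) != (pow(a, p, p) - pow(b, p, p)) % p:
                return False
    return True


print(middle_binomials_divisible(7))  # True
print(middle_binomials_divisible(4))  # False: C(4, 2) = 6 is not divisible by 4
print(power_formulas_hold(5))         # True
print(power_formulas_hold(4))         # False: 4 is not a primenumber
```

The failure for 4 shows that the hypothesis "p is a primenumber" is really used.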

2-27 Fields of characteristic 0. Let g = 0; then $R^*$ is isomorphic
to the ring R of the integral numbers, and therefore it cannot be a field. To
find out its nature, apply the following theorem, which holds for fields of
any characteristic.

Theorem 1. Let A be a subring of a field F, and let A contain more
than one element; the meet of those subfields of F which contain A is a
field consisting of the solutions x of the equations

$ax = b$,

where $a \neq 0$ and b are elements of A.

(In other words: the meet of the subfields is asserted to consist of the
quotients of the elements of A.)

Proof. The meet of those subfields is a field containing the set X of the
solutions. One must therefore prove only that X itself is a field. Since
A is supposed to contain an element $a \neq 0$, and $a \cdot 1 = a$, $a \cdot 0 = 0$ hold, the
elements 0 and 1 belong to X. Therefore X contains more than one element.
We need to show only that the sum, the difference, the product and the
quotient of any two elements of X (the divisor being supposed $\neq 0$) belong
to X.

Let $a_1 x_1 = b_1$, $a_2 x_2 = b_2$; then

$a_1 a_2 (x_1 \pm x_2) = a_2 b_1 \pm a_1 b_2$, (1)

$a_1 a_2 (x_1 x_2) = b_1 b_2$, (2)

and if $x_2 \neq 0$, and therefore $b_2 \neq 0$,

$a_1 b_2 (x_1 / x_2) = b_1 a_2$. (3)

Hence the theorem.

The meet P of all the subfields of F which contain $R^*$ is a subfield of F
and contains therefore the primefield; on the other hand, the primefield
is one of the subfields of F containing $R^*$; hence P is the primefield of F.
If F is supposed to be of characteristic zero, to every pair of integral numbers
$\mathrm{r}$ and $\mathrm{s} \neq 0$ there corresponds a pair of elements $r$ and $s \neq 0$ in $R^*$, and there
exists therefore in P a quotient $r/s$. From (1), (2) and (3) it follows that
addition, subtraction, multiplication and division of those quotients are done
in the same manner as the corresponding operations for rational numbers.
Hence P is homomorphic to the field of the rational numbers. A field cannot
be homomorphic to a field unless the two fields are isomorphic. P is
therefore isomorphic to the field of the rational numbers. Hence:


Theorem 2. The primefield of a field of characteristic 0 is isomorphic to
the field of the rational numbers.

2-28 Quotientfields. The methods used in 2-27 will now be applied
to characterise those rings which are subrings of a field. As in every field
the multiplication is commutative, a ring which is a subring of a field must
be a commutative ring; as furthermore the product of two elements which
are different from 0 is itself different from 0, the same holds in every
subring of a field. It will be shown that these two necessary conditions are
also sufficient.

Theorem. Let A be a commutative ring with the property that any
pair of its elements which are different from zero form a product which
is different from zero. Then A generates a field, which is called the
quotientfield Q(A) of A. The field Q(A) contains a subring A' which is isomorphic
to A, and every element of Q(A) is a quotient of two elements of A'.


Proof. Consider the pairs of elements a, b of A for which $b \neq 0$. These
pairs are distributed into classes by the help of the following equivalence:

$a, b \sim a', b'$

if there exist elements c and d of A such that $c \neq 0$, $d \neq 0$, $ca = da'$ and
$cb = db'$. This equivalence is obviously reflexive and symmetric. To prove
the transitivity, suppose $a', b' \sim a'', b''$, hence $c'a' = d'a''$, $c'b' = d'b''$,
where $c' \neq 0$, $d' \neq 0$. Then $cc'a = dd'a''$, $cc'b = dd'b''$. Since $cc' \neq 0$,
$dd' \neq 0$, we have $a, b \sim a'', b''$. Thus the equivalence generates a partition
of the pairs into classes. The class represented by a, b will be denoted by a/b.
Especially $a/b = ca/cb$. Addition and multiplication of classes will now
be defined by the following formulas (well known from the calculation with
fractional numbers):

$a_1/b_1 + a_2/b_2 = (a_1 b_2 + a_2 b_1)/b_1 b_2$, (1)

$a_1/b_1 \cdot a_2/b_2 = a_1 a_2/b_1 b_2$. (2)

It must be proved that these formulas are independent of the choice of the
representatives.

Let $a_i, b_i \sim a'_i, b'_i$, i.e. $c_i a_i = d_i a'_i$, $c_i b_i = d_i b'_i$, for i = 1, 2;
then

$a_1 a_2/b_1 b_2 = c_1 c_2 a_1 a_2/c_1 c_2 b_1 b_2 = d_1 d_2 a'_1 a'_2/d_1 d_2 b'_1 b'_2 = a'_1 a'_2/b'_1 b'_2$.

Similarly it is shown that

$(a_1 b_2 + a_2 b_1)/b_1 b_2 = (a'_1 b'_2 + a'_2 b'_1)/b'_1 b'_2$.

The two commutative laws and the associative law of multiplication are
obvious. Now

$(a/b + c/d) + e/f = (adf + cbf + bde)/bdf = a/b + (c/d + e/f)$
(associative law of addition),

$a/b \cdot c/d + a/b \cdot e/f = ab(cf + ed)/b^2 df = a/b\,(c/d + e/f) = (c/d + e/f)\,a/b$
(distributive laws).

The equation $a/b + x/y = c/d$ is solved by

$x = cb - ad$, $y = bd \neq 0$;

the elements a/b form therefore a commutative ring; 0/b is its zero-element.
For $a \neq 0$,

$a/b \cdot u/v = c/d$ is solved by $u = bc$, $v = ad \neq 0$.
The ring is therefore a field, say Q(A).

For every particular element a of A, there is an element ca/c of Q(A). This
element is uniquely determined by a, since

$ca/c = da/d$.

As in 2-13, denote this class by (a); then it follows from (1) and (2) that

$(a) + (b) = (a + b)$,

$(a)(b) = (ab)$.

The elements of type (a) form therefore a subring, say A', of Q(A) which
is homomorphic to A. If $a \neq b$, then $ca/c \neq cb/c$; hence the
homomorphism is an isomorphism. Finally

$a/b \cdot (b) = (a)$.

Therefore every element of Q(A) is the quotient of two elements of A'.
Hence Q(A) is the quotient field of A, and the theorem holds.
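
The construction can be tried out on the ring of the integral numbers. The following Python sketch is an illustration only; it represents a class a/b by one of its pairs `(a, b)` and tests equivalence by cross-multiplication ($ab' = a'b$). This test is an assumption of the sketch: in a commutative ring without divisors of zero it agrees with the equivalence of the text, since one may choose $c = b'$, $d = b$. All function names are hypothetical.

```python
def equivalent(p, q):
    """Test (a, b) ~ (a', b') by cross-multiplication: a*b' == a'*b."""
    (a, b), (a2, b2) = p, q
    return a * b2 == a2 * b


def add(p, q):
    """Formula (1): a1/b1 + a2/b2 = (a1*b2 + a2*b1)/(b1*b2)."""
    (a1, b1), (a2, b2) = p, q
    return (a1 * b2 + a2 * b1, b1 * b2)


def mul(p, q):
    """Formula (2): a1/b1 * a2/b2 = (a1*a2)/(b1*b2)."""
    (a1, b1), (a2, b2) = p, q
    return (a1 * a2, b1 * b2)


def embed(a, c=1):
    """The class (a) = ca/c, for any c != 0."""
    return (c * a, c)


# Independence of the representatives: 1/2 = 2/4 and 1/3 = 3/9.
print(equivalent(add((1, 2), (1, 3)), add((2, 4), (3, 9))))  # True

# a/b + x/y = c/d is solved by x = cb - ad, y = bd (here a/b = 1/2, c/d = 3/4).
a, b, c, d = 1, 2, 3, 4
print(equivalent(add((a, b), (c * b - a * d, b * d)), (c, d)))  # True

# a/b * u/v = c/d is solved by u = bc, v = ad.
print(equivalent(mul((a, b), (b * c, a * d)), (c, d)))  # True

# (a) + (b) = (a + b), whatever representatives ca/c are used.
print(equivalent(add(embed(2, 3), embed(5, 7)), embed(7)))  # True
```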

2-29 Relation between a field and its subrings. Integral domains.
To embed the ring A into a field of which A is a subring, we use the following
lemma.

Lemma. Let A be a subring of a ring B, and let A' be a ring which is
isomorphic to A; then there exists a ring B' which is isomorphic to B and
which contains A' as a subring.

Proof. Denote the elements of A by $a_1, a_2, \ldots$, in general by the letter
a with an index but without a dash; let the remaining elements of B be
denoted by $\beta_1, \beta_2, \ldots$, with indices but without a dash. Consider now an
isomorphism $J_0$ mapping A on A', and denote by $a'_j$ the element of A'
corresponding to $a_j$, where j runs over all the indices which occur. If
$a_i + a_j = a_k$ and $a_i a_j = a_m$, then it follows from the isomorphism that
$a'_i + a'_j = a'_k$ and $a'_i a'_j = a'_m$. Create now new elements $\beta'_1, \beta'_2, \ldots$
corresponding to the elements $\beta_1, \beta_2, \ldots$ of B. These new elements, together
with the elements of A', form a set B' which is in a (1,1)-correspondence to B;
of course the correspondence can be generated by simply affixing a dash
to the notations of the elements of B. The addition and the multiplication
in B' will now be defined in this way: if for any elements $\rho$, $\sigma$, $\tau$ and $\mu$ of
B, $\rho + \sigma = \tau$ and $\rho\sigma = \mu$ hold, then $\rho' + \sigma' = \tau'$ and $\rho'\sigma' = \mu'$. By this
definition B' is made a ring, and the mapping of B on B' generated by
affixing a dash becomes an isomorphism J; for the elements of A this
isomorphism is identical with $J_0$, and the addition and the multiplication of
elements of A' is the same as it was defined originally. Hence the ring
A' is a subring of B'.

If in particular B is a field, then B' is a field and A' is embedded into B'
as a subring. Applying the lemma to the preceding theorem, one therefore
gets the theorem:

Theorem 1. If A is a commutative ring in which 0 cannot be represented
as a product of two elements different from 0, then A is a subring of a
field which is isomorphic to Q(A).

Definition. A commutative ring containing a unitelement is said to
be an integral domain if the product of any two of its elements which are
both different from zero is itself different from zero.

Hence: An integral domain can be embedded into a field. The close
connection between A and its quotientfield Q(A) is shown by the following
theorem.

Theorem 2. If A is a subring of a field F, the meet of all the subfields of
F which contain A is isomorphic to Q(A).

Proof. As has been shown in 2-27, the meet X of those subfields
consists of the quotients of the elements of A. Addition, subtraction,
multiplication and division of these quotients are given by the formulas
(1), (2) and (3) of 2-27. To every element of Q(A) there corresponds an
element of X, and as the formulas (1) and (2) of 2-28 for the rational
operations in Q(A) tally with the corresponding formulas of 2-27 for the
elements of X, the field X is homomorphic to Q(A). Since Q(A) is a field,
the theorem follows from the corollary in 2-23.

2-291 Identification. The notion of isomorphism and the lemma of 2-29
can be generalised. Consider any system of mathematical objects which
may be subject to certain operations. Modules and rings are instances
of such systems; the objects are called “elements”. In a module there
is one operation (addition), in a ring there are two operations (addition and
multiplication). Another example of a system of this kind is the “affine
space”; its objects are “points” and “vectors”; the operations are addition
of vectors, multiplication of vectors with real numbers, and addition of a
point and a vector. A system is therefore not uniquely determined by the
objects alone; of course the same set of objects furnishes different systems
if the operations are different. In the same module a “multiplication”
can possibly be introduced in more than one way; thus one may construct
two rings which are different, though composed of the same elements. If
in the geometrical system just introduced as “affine space”, in addition
to the operations already introduced, the relation of “scalar product” is
established, the affine space becomes a “metric space”, which is not the same
system as the affine space. Let now $\Sigma$ and $\Sigma'$ be any two systems with the
same operations, but possibly different elements, and let there be a
(1,1)-correspondence J generated by mapping the elements $\alpha, \beta, \ldots$ of $\Sigma$ on the
elements $\alpha', \beta', \ldots$ of $\Sigma'$; let furthermore R be one of the given operations,
by which from $\alpha_\nu, \beta_\nu, \ldots, \gamma_\nu$ results $R(\alpha_\nu, \beta_\nu, \ldots, \gamma_\nu) = \kappa_\nu$. If
$R(\alpha'_\nu, \beta'_\nu, \ldots, \gamma'_\nu) = \kappa'_\nu$ is the element of $\Sigma'$ on which $\kappa_\nu$ is mapped by J, then
R is said to be invariant for J. If every operation of $\Sigma$ is invariant, then
J is said to be an isomorphism and $\Sigma$ is isomorphic to $\Sigma'$. Obviously the
isomorphism satisfies the 3 conditions for an “equivalence”. For rings and
fields, this general definition of isomorphism tallies with the definitions
given earlier. The lemma of 2-29 can now be generalised in the following
way:

“Let $\Sigma$ and T be two systems with the same operations, and let every
element of $\Sigma$ be an element of T (i.e. $\Sigma$ is a subsystem of T). If $\Sigma'$ is
isomorphic to $\Sigma$, then there exists a system T' which is isomorphic to T and
which contains $\Sigma'$ as a subsystem.”

This lemma is proved in the same way as the lemma in 2-29. The
reader may work out the proof as an exercise.

Often, isomorphic systems are considered to be equal. They are
considered to be different representations of the same thing. E.g. one speaks
of the affine plane though there are different planes which are isomorphic
only; but for affine planimetry one needs to consider the common properties
of all these planes, and it is therefore convenient to take these planes
as representations only of “the” plane. It is not always possible to
proceed so; in solid geometry one has to consider simultaneously different
(isomorphic) planes, say $p_1$ and $p_2$, and these may intersect in a line, say s,
whereas if $p_1$ and $p_2$ are considered to be the same plane, every point of them
is a common point; thus one has to distinguish between isomorphic systems
in this case. Similarly in Algebra. As far as the properties of a particular
ring R are investigated, it is not necessary to make any distinction between
R and a ring R' which is isomorphic to R. On the other hand, if one
discusses a ring S which contains two isomorphic rings R and R', it is necessary
to make a distinction between them. There are however cases where two
isomorphic systems $\Sigma_1$ and $\Sigma_2$ will not be considered later on as subsystems
of a larger system, e.g. when by the introduction of $\Sigma_2$ the system $\Sigma_1$
becomes superfluous. In this case, there is no harm in identifying them,
i.e. in considering them as representations, or as different names only, of
one and the same thing.

Now consider the ring A of the last theorem of 2-28; its quotientfield
Q(A) contains a subring A' isomorphic to A. From the lemma it follows
that A can be embedded into a field B which is isomorphic to Q(A); it
is now very convenient to identify B with Q(A) and A with A'. Of course,
if in building up elementary arithmetic one extends the ring of the integral
numbers, at first the quotientfield of the fractions a/b is introduced, and
then the fractions a/1 are identified with the integral numbers a; otherwise
it would be necessary to make a distinction between the fractions a/b
and the rational numbers a : b. That could be done; one does not lose any
logically important step in renouncing identification, but the mathematical
language would become very heavy and overloaded by isomorphisms. To
understand this clearly, a short review of the steps leading from the
notion of natural number to the notion of complex number may be helpful.

(1) Out of the two signs +, - and the natural numbers a, form all the pairs
+a and -a. These pairs together with a new symbol 0 form a system.
Addition and multiplication are now defined in such a manner that the
system becomes a ring R and the subsystem of “positive numbers” +a
is isomorphic to the system of the natural numbers. Natural numbers
are identified with positive numbers. R is the ring of integral numbers.

(2) Form Q(R) and identify R with the subring of the fractions with
denominator 1. Q(R) is the field of the rational numbers.

(3) Form the Dedekind sections in the ordered system of the rational
numbers. For a suitable definition of addition and of multiplication, these
form a field D. The primefield P of D consists of those sections which are
determined by rational numbers. P is identified with Q(R), and D is called
the field of the real numbers.

(4) Form pairs (a, b) of real numbers a, b. For suitably chosen operations
of addition and multiplication, these pairs form a field F, and the elements
(a, 0) form a subfield which is isomorphic to D. Identify D with this
subfield. F is the field of the complex numbers.

Thus four identifications are performed to get the complex numbers
starting from the natural numbers. It is well known that there are also
different ways; e.g. one can define real numbers as continued fractions,
or as decimal fractions, etc. Similarly complex numbers may be introduced
by classes of residues, etc. Obviously a Dedekind section is a thing which
is different from a continued fraction. The real numbers defined by
continued fractions form a field which by its substance is different from the
field of the real numbers defined as Dedekind sections; however, the two fields
are isomorphic. Every mathematical statement holding in one of these
fields holds also in the other one. In investigations on mathematical logic,
it may be necessary to consider these two fields simultaneously, and
therefore to make a distinction between them. In “pure mathematics” there is
no reason to do so; thus we are justified in identifying these two fields
(and some others), and in speaking of “the” field of the real numbers.

2-3 Polynomials

2-31 Preliminary investigations. Let

$x, a_0, a_1, \ldots, a_n, b_0, b_1, \ldots, b_m$

be elements of a ring; then this ring contains also the elements

$a_0 + a_1 x + \ldots + a_n x^n = \sum_{i=0}^{n} a_i x^i$ (1)

and

$b_0 + b_1 x + \ldots + b_m x^m = \sum_{j=0}^{m} b_j x^j$. (1')

Suppose x to be commutative with every element of the ring. From the laws
of addition and multiplication (see 2-1), as far as they are bound to be
satisfied in a ring of this kind, it follows that

$\sum_{i=0}^{n} a_i x^i + \sum_{j=0}^{m} b_j x^j = \sum_{\nu=0}^{N} s_\nu x^\nu$, (2)

$\sum_{i=0}^{n} a_i x^i - \sum_{j=0}^{m} b_j x^j = \sum_{\nu=0}^{N} d_\nu x^\nu$, (3)

$\sum_{i=0}^{n} a_i x^i \cdot \sum_{j=0}^{m} b_j x^j = \sum_{\mu=0}^{M} g_\mu x^\mu$, (4)

where $N \geq n$ and $m$, $M \geq n + m$, $a_i = 0$ for $i > n$, $b_j = 0$ for $j > m$, and

$s_\nu = a_\nu + b_\nu$, (2')

$d_\nu = a_\nu - b_\nu$, (3')

$g_\mu = \sum_{i+j=\mu} a_i b_j$. (4')

If $\nu$ is greater than n and m, $s_\nu = d_\nu = 0$; similarly for $\mu > n + m$, $g_\mu = 0$.
Independently of x, the element (1) of the ring is not altered if some terms
with coefficients equal to zero are added. For particular values of x, such
expressions may determine the same element even if they differ in every
coefficient, e.g. $x - 2$ and $x^2 + 2x - 4$ are equal for x = 1. It is of
fundamental importance that there exists a class of rings in which two
elements (1) are equal if and only if corresponding coefficients are equal
(terms with zero-coefficients being omitted). Every ring can be extended
to a ring of this kind by the operation which will be described now.
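
The coefficient rules (2'), (3') and (4') translate directly into a computation. The following Python sketch is an illustration only: it assumes that an expression $\sum a_i x^i$ is stored as the list `[a_0, a_1, ..., a_n]` of its coefficients and that the coefficients are ordinary integral numbers; the function names are hypothetical.

```python
def poly_add(a, b):
    """Coefficients s_v = a_v + b_v, formula (2'); missing coefficients count as 0."""
    n = max(len(a), len(b))
    a = a + [0] * (n - len(a))
    b = b + [0] * (n - len(b))
    return [x + y for x, y in zip(a, b)]


def poly_sub(a, b):
    """Coefficients d_v = a_v - b_v, formula (3')."""
    return poly_add(a, [-y for y in b])


def poly_mul(a, b):
    """Coefficients g_mu = sum of a_i * b_j over i + j = mu, formula (4')."""
    if not a or not b:
        return []
    g = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            g[i + j] += ai * bj
    return g


def evaluate(a, x):
    """Value of a_0 + a_1 x + ... + a_n x^n for a particular x."""
    return sum(c * x**i for i, c in enumerate(a))


p = [-2, 1]       # x - 2
q = [-4, 2, 1]    # x^2 + 2x - 4

print(poly_add(p, q))                 # [-6, 3, 1]
print(poly_mul([1, 1], [1, 1]))       # [1, 2, 1]
print(evaluate(p, 1), evaluate(q, 1)) # -1 -1
print(p == q)                         # False
```

The last two lines repeat the example of the text: the two expressions take the same value for x = 1 although no coefficient of one equals the corresponding coefficient of the other.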

2-32 Definition of a polynomial. For the purpose mentioned above,
one starts from a ring R, whose elements are denoted by

$a_0, a_1, \ldots, b_0, b_1, \ldots, c_0, c_1, \ldots$

Introduce now a symbol x which is not used for the notation of elements of
R. Then create new elements, the polynomials in x over R, which are denoted
in the same manner as formula (1) of 2-31. It may be emphasised that the
polynomials are not yet elements of a ring, but they will be made so by
suitable definitions. The system of these polynomials is called

$R[x]$. (1)

As usual, the elements $a_0, a_1, \ldots, a_n$ are called coefficients; the symbol x
is said to be an indeterminate. Two polynomials are considered to be equal
if and only if, after omission of the terms with zero-coefficients, they tally
in every coefficient. This definition of equality is admissible, as it
obviously satisfies the laws of reflexivity, symmetry and transitivity. For
abbreviation, we are allowed to omit terms with zero-coefficients; furthermore
we may omit any coefficient which is equal to 1 (provided that a unitelement 1
exists in R). Thus the symbol x can itself be considered as a polynomial:

$x = 0 + 1x$. (2)

This formula is not trivial, as up to now the sign + in 2-31, (1) is a
mere symbol and not the sign of addition. Addition, multiplication and
subtraction of polynomials have not yet been defined, but suitable
definitions will be given below.

Definition. If $a_i = 0$ for $i > d$, but $a_d \neq 0$, then d is called the degree
of the polynomial $\sum a_i x^i$. If every coefficient is zero, the degree is equal
to -1.

By this definition, to every polynomial a definite integral number $\geq -1$
is allotted as its degree. Equal polynomials have the same degree. The
polynomials of degree -1 are all equal, whereas for degrees $\geq 0$ there exist
different polynomials of the same degree. The polynomials of degree
< 1 are of the type

$a_0 + 0x + \ldots + 0x^n$, (3)

where $a_0$ runs over all the elements of R. These polynomials form a subset

$R^0[x]$ (4)

of R[x]. Every polynomial (3) is, by the definition of equality of
polynomials, equal to a polynomial $a_0$. Thus there is a (1,1)-correspondence
between the elements of the ring R and those of $R^0[x]$, so that corresponding
elements are written in the same way. The distinction between R and
$R^0[x]$ will disappear later on.
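
The definitions of equality and of degree can be sketched as follows. This Python fragment is an illustration under the assumption that a polynomial is stored as the list `[a_0, a_1, ..., a_n]` of its coefficients, so that omitting terms with zero-coefficients amounts to dropping zeros at the end of the list; the names `strip`, `poly_equal` and `degree` are hypothetical.

```python
def strip(coeffs):
    """Drop the zero-coefficients at the end of the list of coefficients."""
    c = list(coeffs)
    while c and c[-1] == 0:
        c.pop()
    return c


def poly_equal(a, b):
    """Equality of polynomials: all coefficients tally once zero terms are omitted."""
    return strip(a) == strip(b)


def degree(a):
    """Degree as defined in the text; -1 if every coefficient is zero."""
    return len(strip(a)) - 1


print(poly_equal([1, 2, 0], [1, 2]))  # True: 1 + 2x + 0x^2 equals 1 + 2x
print(degree([1, 0, 3, 0]))           # 2
print(degree([5, 0, 0]))              # 0: a polynomial of type (3)
print(degree([0, 0]))                 # -1
```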

2-33 Rings of polynomials

Definition. The sum [the product] of two polynomials of R[x] is a
polynomial in R[x]; the coefficients are determined by (2') [by (4')] of 2-31.

Theorem. R[x] is a ring. $R^0[x]$ is a subring of R[x] which is
isomorphic to R, corresponding elements being denoted in the same way.
A polynomial is the sum of the one-term polynomials $a_0, a_1 x, \ldots, a_n x^n$, and
each polynomial $a_n x^n$ is the product of the polynomial $a_n$ (of degree < 1)
and the polynomial $x^n$.

Proof. To show that R[x] is a ring, one has to prove that the commutative
law of addition, the associative laws of addition and multiplication,
the distributive laws and the law of inverse-existence for addition hold.
Since these laws hold in R, one can easily derive them for R[x] by the help
of the formulas of 2-31. The commutative law of addition follows directly
from (2) and (2') of 2-31. Furthermore

$\sum s_\nu x^\nu + \sum c_\nu x^\nu = \sum t_\nu x^\nu$,

where $t_\nu = s_\nu + c_\nu = a_\nu + b_\nu + c_\nu = a_\nu + (b_\nu + c_\nu)$. Hence

$\sum t_\nu x^\nu = \sum a_\nu x^\nu + (\sum b_\nu x^\nu + \sum c_\nu x^\nu)$,

i.e. the associative law for the addition of polynomials. Similarly one shows
that $(\sum a_\nu x^\nu \cdot \sum b_\nu x^\nu) \sum c_\nu x^\nu$ and $\sum a_\nu x^\nu\,(\sum b_\nu x^\nu \cdot \sum c_\nu x^\nu)$
are both equal to $\sum u_\rho x^\rho$, where

$u_\rho = \sum_{\kappa + \lambda + \mu = \rho} a_\kappa b_\lambda c_\mu$.

It may be left to the reader to check the two distributive laws in the
same manner. If $d_\nu$ is defined by 2-31, (3'), then $\sum d_\nu x^\nu + \sum b_\nu x^\nu = \sum a_\nu x^\nu$
follows from (2') of 2-31. Hence R[x] is a ring. The remaining
propositions of the theorem are immediate consequences of the definition of
addition and multiplication of polynomials.

Since $R^0[x]$ is isomorphic to the ring R, one identifies the elements of
$R^0[x]$ with the elements of R, which are already denoted in the same manner.
It is therefore not necessary any more to distinguish between the polynomial
$a_0$ (of degree < 1) and the element $a_0$ of R. The ring R[x] is an extension
of the ring R, since by the identification carried out just before, R becomes
a subring of R[x]. As every subring (and even every submodule) of a ring
(a module) contains the nullelement of the ring (module), the rings R[x]
and R have the same nullelement, i.e. the nullelement 0 of R is also the
nullelement of R[x]. Furthermore, if R contains a unitelement 1, this
unitelement is also the unitelement of R[x].

Since a polynomial has been proved to be the sum of its terms (which
are polynomials with at most one coefficient different from zero), one can
interchange the terms without altering the polynomial; e.g. one can write
the terms in the opposite order:

$a_n x^n + a_{n-1} x^{n-1} + \ldots + a_0$.

Exercises. Let R be a ring which may or may not be commutative
but contains a unitelement.

1. Prove that $x^n$ is commutative with every polynomial of R[x].

2. Let R[x] be a subring of a ring $R_1$; prove that R[x] is the meet of all
those subrings of $R_1$ which contain R and x.

2-34 Commutative rings of polynomials

Theorem 1. If R is a commutative ring, then R[x] is a commutative
ring.

Proof. By the theorem of 2-33, it has been shown that R[x] is a ring;
one has only to check the commutativity of the multiplication. Of course
the commutativity follows from 2-31, (4') and

$\sum_{i+j=\mu} a_i b_j = \sum_{i+j=\mu} b_j a_i$.

Theorem 2. If R is an integral domain, so is R[x].

Proof. As in consequence of theorem 1 R[x] is a commutative ring
containing the unitelement 1, one has to prove only that if $\sum_{\nu=0}^{n} a_\nu x^\nu$ and
$\sum_{\nu=0}^{m} b_\nu x^\nu$ are of degree $\geq 0$, then the same holds for their product
$\sum_{\mu=0}^{n+m} g_\mu x^\mu$. Without loss of generality we can suppose that $a_n \neq 0$ and
$b_m \neq 0$; hence $g_{n+m} = a_n b_m \neq 0$. Hence the theorem.

In a similar way one proves

Theorem 3. If R is a subring of a field, then the degree of the product
of two polynomials of R[x] is the sum of the degrees of the factors,
provided the factors are both of degree $\geq 0$.

Exercise. Show that the condition “R is an integral domain” cannot
be replaced in theorem 2 by “R is a commutative ring”.

Theorem 4. R[x] is not a field.

Proof. If R consists of the nullelement only, then R[x] is equal to R,
and as it consists of one element only, it is not a field. Let $a \neq 0$ be an
element of R; then R[x] contains ax. If R[x] is a field, R is a subring of a
field; hence theorem 3 can be applied. On the other hand there exists
in R[x] a polynomial of degree, say, m, which is inverse to ax. The product
of the two polynomials is equal to 1, and therefore it is a polynomial of degree
0. Now m > -1, but from theorem 3 it follows that m + 1 = 0.
This is a contradiction.

2-35 Integral functions. Let C be a commutative ring containing
the unitelement 1; then the same holds for C[x]. Let $a_0, a_1, \ldots, a_n$ be
elements of C, and $\lambda$ an arbitrary element of C[x]; then

$a_0 + a_1 \lambda + a_2 \lambda^2 + \ldots + a_n \lambda^n = f(\lambda)$ (1)

is again an element of C[x]. The correspondence mapping $\lambda$ on $f(\lambda)$ is
called an integral rational function over C, or briefly (as we are only concerned
with Algebra) an integral function over C. These functions map the
elements of C on elements of C, and the elements of C[x] on elements of
C[x]. Especially 0 is mapped on $a_0$, and x (which indeed is an element
of C[x], as 1 is supposed to be an element of C) is mapped on the polynomial
f(x) which has the same coefficients as (1) has. On the other hand, given
any polynomial in x over C, there exists an integral function over C having
the same coefficients; one has only to replace the indeterminate x by the
variable $\lambda$ running over the elements of C[x] or a subset of them.

Again consider all the polynomials in x over C, say

$f(x), f_1(x), \ldots$

and any particular element of C[x], say $\lambda$. To every element f(x) of C[x]
let correspond the element $f(\lambda)$, which also belongs to C[x]. Let $f_1(x)$
and $f_2(x)$ be any two elements of C[x]:

$f_1(x) = \sum a_i x^i$, $f_2(x) = \sum b_j x^j$, $f_1(x) + f_2(x) = f_3(x) = \sum s_\nu x^\nu$,

$f_1(x) f_2(x) = f_4(x) = \sum g_\mu x^\mu$.

Then the coefficients $a_i, b_j, s_\nu, g_\mu$ are interconnected by 2-31, (2') and (4').

Since C[x] is a commutative ring, $\lambda$ is commutative with the elements
$a_i$ and $b_j$. Hence (see 2-31) the addition and the multiplication of $\sum a_i \lambda^i$
and $\sum b_j \lambda^j$ is done also in accordance with 2-31, (2), (2'), (4) and (4').
Therefore

$f_1(\lambda) + f_2(\lambda) = f_3(\lambda)$, $f_1(\lambda) f_2(\lambda) = f_4(\lambda)$ hold.

The correspondence $f(x) \to f(\lambda)$ is therefore a homomorphism. By it,
the ring of all the elements of C[x] is mapped on a subring $C_\lambda$ which is
homomorphic to C[x].

Let $\lambda = b$ be an element of C. Then every element of $C_b$ is an element
of C; on the other hand the polynomials $x + (a - b)$ of C[x] are mapped
on a, which is supposed to be an arbitrary element of C. Hence $C_b = C$.
Thus every element of C generates a homomorphism mapping C[x] on C.

Exercise. Consider the homomorphism generated by an element $\lambda$
of C[x]: (1) if $\lambda$ is of degree > 1, (2) if $\lambda$ is of degree 1 and C is a field.

By a homomorphism the nullelement is always mapped on the
nullelement. If therefore

$F(f_1(x), \ldots, f_n(x)) = 0$,

then

$F(f_1(\lambda), \ldots, f_n(\lambda)) = 0$

for every $\lambda$ out of C[x]. Hence:

Theorem. Every equation between elements of C[x] remains correct
if for x one puts any particular element of C[x].

It may be mentioned that the commutativity of C is essential for the
validity of this theorem.
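
The homomorphism $f(x) \to f(\lambda)$ can be tried out on small examples. The following Python sketch is an illustration only: it assumes that C is the ring of the integral numbers and that a polynomial is stored as the list `[a_0, a_1, ..., a_n]` of its coefficients; the function names, and the use of Horner's scheme to compute $f(\lambda)$, are choices made for this example.

```python
def strip(c):
    """Omit the zero-coefficients at the end of the list of coefficients."""
    c = list(c)
    while c and c[-1] == 0:
        c.pop()
    return c


def poly_add(a, b):
    """Sum of two polynomials, coefficient by coefficient (2-31, (2'))."""
    n = max(len(a), len(b))
    return strip([(a[i] if i < len(a) else 0) + (b[i] if i < len(b) else 0)
                  for i in range(n)])


def poly_mul(a, b):
    """Product of two polynomials (2-31, (4'))."""
    if not a or not b:
        return []
    g = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            g[i + j] += ai * bj
    return strip(g)


def substitute(f, lam):
    """Return f(lam) as an element of C[x], computed by Horner's scheme."""
    result = []
    for coeff in reversed(f):
        result = poly_add(poly_mul(result, lam), [coeff])
    return result


f1 = [1, 2]      # 1 + 2x
f2 = [0, 1, 1]   # x + x^2
lam = [1, 1]     # lambda = 1 + x

f3 = poly_add(f1, f2)
f4 = poly_mul(f1, f2)
print(substitute(f3, lam) == poly_add(substitute(f1, lam), substitute(f2, lam)))  # True
print(substitute(f4, lam) == poly_mul(substitute(f1, lam), substitute(f2, lam)))  # True

# With lambda = b in C, the polynomial x + (a - b) is mapped on a (here a = 7, b = 3).
print(substitute([7 - 3, 1], [3]))  # [7]
```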


2-36 Polynomials in two indeterminates. Derivatives. Let R be a
ring, and x and y be indeterminates; then

$R[x] = T$, $R[y] = S$

are also rings. T[y] consists of the polynomials in y of which the
coefficients are polynomials in x over R, i.e. it consists of the sums

$\sum a_{jk} x^j y^k$, (1)

where $a_{jk}$ runs over the elements of R. The same holds for S[x]. Hence

$T[y] = (R[x])[y] = S[x] = R[x, y]$. (2)

This statement can easily be generalised for any number of indeterminates;
one writes

$R[x_1, x_2, \ldots, x_n]$

without making any distinction about the order in which the extension is
performed.

Let 

<p(x) = % a i x>, and % <p {0) = 0, then a 0 = 0 and <p(x) = x <Pi{x), 

where <p x (x) is a polynomial m x over the same ring as <p{x) is, provided 
this ring contains a umtelement _ 

Let $D$ be an integral domain; as in 2-25 the sum of $n$ terms, each being
equal to 1, is denoted by $n$. Now let $f(x)$ be a polynomial of $D[x] = T$,
then $f(x + y)$ is a polynomial of $T[y]$.

$$f(x + y) - f(x) = F(y), \qquad F(0) = 0$$

Hence

$$F(y) = y\, F_1(y)$$

Now $F_1(0)$ is an element of $T = D[x]$, say

$$F_1(0) = f'(x), \qquad (3)$$

where $f'(x)$ is a polynomial in $x$ over $D$ and is uniquely determined by $f(x)$.
The polynomial $f'(x)$ is said to be the derivative of $f(x)$. If $D$ consists of
real numbers only, this notion tallies with what in Analysis is called the
derivative of an integral rational function. The reader knows that in
Analysis the notion of derivative applies to a much larger class of functions
of a real variable than rational functions only. Here, a derivative is defined
for polynomials only, but the coefficients are not necessarily real numbers.
The formulas for the derivatives of sums and products are the same as in
Analysis; even the proofs are nearly the same, the only difference being
that there is no passage to the limit. Readers are advised to compare carefully
the following proofs with those given in Analysis, to understand clearly the
difference between an indeterminate and a variable which takes real values,
some of which may make the functions senseless.

Let

$$f(x) = f_1(x) + f_2(x),$$

and let $F_{11}(y)$, $F_{21}(y)$ be formed from $f_1(x)$, $f_2(x)$ in the same way
as $F_1(y)$ is formed from $f(x)$; then

$$y F_1(y) = \{f_1(x + y) - f_1(x)\} + \{f_2(x + y) - f_2(x)\} = y \{F_{11}(y) + F_{21}(y)\}$$

Hence

$$y \{F_1(y) - F_{11}(y) - F_{21}(y)\} = 0.$$

Since $D$ and therefore $D[y]$ are integral domains, and $y$ is not the nullelement
of $D[y]$, the factor in brackets is zero. Put $y = 0$; then from (3) follows:

$$f'(x) = f_1'(x) + f_2'(x) \qquad (4)$$


Similarly, let

$$g(x) = f_1(x) f_2(x),$$

then, with $F_{11}(y)$ and $F_{21}(y)$ as before,

$$y G_1(y) = \{f_1(x + y) - f_1(x)\} f_2(x + y) + \{f_2(x + y) - f_2(x)\} f_1(x)$$
$$= y \{F_{11}(y) f_2(x + y) + F_{21}(y) f_1(x)\},$$

wherefrom it follows by the same consideration as above that

$$g'(x) = f_1'(x) f_2(x) + f_2'(x) f_1(x) \qquad (5)$$

Furthermore

$$f'(x) = 0 \text{ for } f(x) = c, \qquad f'(x) = 1 \text{ for } f(x) = x \qquad (6)$$

From (4), (5), (6) it follows in the same way as in Analysis: for

$$f(x) = \sum_j a_j x^j, \qquad f'(x) = \sum_j j\, a_j x^{j-1} \qquad (7)$$
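
The definition (3) can be compared with formula (7) by a direct computation. The following is only a sketch, assuming integral coefficients and a polynomial stored as the list $[a_0, a_1, \ldots]$; the names `shift_expand`, `derivative_via_F1` and `derivative_by_formula` are illustrative and not part of the text. It expands $f(x + y)$ by the binomial formula and reads off $F_1(0)$ as the coefficient of $y$.

```python
from math import comb

def strip(p):
    """Remove trailing zero coefficients, keeping at least one entry."""
    p = list(p)
    while len(p) > 1 and p[-1] == 0:
        p.pop()
    return p

def shift_expand(f):
    """Return f(x + y) as a list c, where c[i] is the coefficient of y^i,
    itself a coefficient list in x."""
    n = len(f)
    c = [[0] * n for _ in range(n)]
    for j, a in enumerate(f):
        for i in range(j + 1):
            # a_j (x + y)^j contributes a_j * C(j, i) * x^(j-i) * y^i
            c[i][j - i] += a * comb(j, i)
    return c

def derivative_via_F1(f):
    """f'(x) = F_1(0): the coefficient of y in F(y) = f(x + y) - f(x)."""
    c = shift_expand(f)
    return strip(c[1] if len(c) > 1 else [0])

def derivative_by_formula(f):
    """f'(x) according to formula (7): sum of j * a_j * x^(j-1)."""
    return strip([j * a for j, a in enumerate(f)][1:] or [0])

f = [5, 0, 3, 2]   # 5 + 3x^2 + 2x^3
assert derivative_via_F1(f) == derivative_by_formula(f) == [0, 6, 6]
```

No passage to a limit occurs anywhere: the derivative is obtained by rearranging coefficients only.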


2-37 Homogeneous polynomials. Again let $D$ be an integral domain,
$x_1, \ldots, x_n, t$ be indeterminates; denote

$$D_1 = D[x_2, \ldots, x_n], \quad D_2 = D[x_1, x_3, \ldots, x_n], \quad \ldots, \quad D_n = D[x_1, \ldots, x_{n-1}] \qquad (1)$$

hence

$$S = D[x_1, \ldots, x_n] = D_k[x_k], \quad \text{for } k = 1, \ldots, n \qquad (2)$$

Every polynomial $f(x_1, \ldots, x_n)$ of $S$ can be considered to be a polynomial
in $x_k$ over $D_k$:

$$f(x_1, \ldots, x_n) = f_k(x_k), \quad k = 1, \ldots, n \qquad (3)$$

Consider now the polynomial

$$f(t x_1, \ldots, t x_n)$$

which belongs to $S[t]$.

Definition. $f(x_1, \ldots, x_n)$ is said to be homogeneous of degree $m$ if

$$f(t x_1, \ldots, t x_n) = t^m f(x_1, \ldots, x_n) \qquad (4)$$

holds.


Let $f(x_1, \ldots, x_n)$ be homogeneous of degree $m$. Put

$$f(x_1, \ldots, x_n) = \sum_j a_j x_1^{s_j} \cdots x_n^{w_j}, \qquad (5)$$

where to different $j$ correspond different sets $(s_j, \ldots, w_j)$. Then

$$t^m f(x_1, \ldots, x_n) = f(t x_1, \ldots, t x_n) = \sum_j a_j x_1^{s_j} \cdots x_n^{w_j}\, t^{s_j + \cdots + w_j}$$

Hence

$$s_j + \cdots + w_j = m, \quad \text{for every occurring } j \qquad (6)$$

is a necessary condition for $f(x_1, \ldots, x_n)$ to be homogeneous of degree $m$.

Obviously this condition is also sufficient. Forming the $n$ derivatives and
multiplying each of them with the corresponding indeterminate $x_k$, one gets

$$x_1 f_1'(x_1) = \sum_j s_j a_j x_1^{s_j} \cdots x_n^{w_j}, \quad \ldots, \quad x_n f_n'(x_n) = \sum_j w_j a_j x_1^{s_j} \cdots x_n^{w_j},$$

where, as in 2-25, (3), $s_j$ stands for the element which one gets by
taking the unitelement $s_j$ times. By addition it follows from (6) that

$$\sum_{k=1}^{n} x_k f_k'(x_k) = m\, f(x_1, \ldots, x_n) \qquad (7)$$

holds (Euler's formula).
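
Euler's formula (7) can be checked on an example. The following is only a sketch, assuming integral coefficients and a polynomial stored as a dictionary which maps each set of exponents $(s_j, \ldots, w_j)$ to the coefficient $a_j$; all names are illustrative.

```python
# A polynomial in n indeterminates is stored as {exponent tuple: coefficient}
# (an assumption made for this sketch).

def partial(f, k):
    """Derivative of f with respect to the k-th indeterminate (k counted from 0)."""
    out = {}
    for exps, a in f.items():
        if exps[k] > 0:
            e = exps[:k] + (exps[k] - 1,) + exps[k + 1:]
            out[e] = out.get(e, 0) + exps[k] * a
    return out

def times_x(f, k):
    """Multiply f by the k-th indeterminate."""
    return {exps[:k] + (exps[k] + 1,) + exps[k + 1:]: a for exps, a in f.items()}

def add(f, g):
    """Sum of two polynomials, dropping terms with coefficient zero."""
    out = dict(f)
    for e, a in g.items():
        out[e] = out.get(e, 0) + a
    return {e: a for e, a in out.items() if a != 0}

def scale(f, m):
    """The polynomial m * f."""
    return {e: m * a for e, a in f.items() if m * a != 0}

def euler_sum(f, n):
    """Left-hand side of (7): sum over k of x_k times the k-th derivative."""
    total = {}
    for k in range(n):
        total = add(total, times_x(partial(f, k), k))
    return total

# 2 x1^3 - 5 x1 x2 x3 + 7 x2^2 x3 is homogeneous of degree m = 3.
f = {(3, 0, 0): 2, (1, 1, 1): -5, (0, 2, 1): 7}
assert all(sum(exps) == 3 for exps in f)      # condition (6)
assert euler_sum(f, 3) == scale(f, 3)         # formula (7)
```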

2-4 Factorisation 

Fundamental notions. Suppose $R$ to be a commutative ring. If $a$, $b$, $c$
are elements of $R$, and

$$a\, b = c \qquad (1)$$

holds, $a$ and $b$ are said to be factors of $c$, and $c$ is said to be divisible by $a$ and $b$.
Whereas in a field every element is divisible by every element different from
zero, there is no corresponding theorem for rings. As some rings (e.g.
the ring of the integral numbers, and the rings $R[x]$) play an important
role in mathematics, it is necessary to consider the mutual divisibility of
elements of certain classes of rings which are not fields.

Let $D$ be an integral domain. If every element of $D$ is divisible in $D$
by a particular element, say $e$, then 1 is divisible by $e$, and therefore $e^{-1}$
belongs to $D$. If on the other hand $e$ and $e^{-1}$ belong to $D$, then for every
$a$ of $D$, the elements $a e^{-1}$ and $a e$ belong to $D$; hence every element $a$ of
$D$ is divisible by $e$ and $e^{-1}$. Thus the elements which are factors of every
element of $D$ are exactly those elements of which an inverse element exists
in $D$. The unitelement 1, for instance, has this property; so these elements
are called the unities of $D$. If $e_1$ and $e_2$ are unities, then the same holds
for $e_1 e_2$ and $e_1 e_2^{-1}$. Thus the unities in $D$ form a multiplicative
commutative group.

Some examples: 1. Let $D$ be the ring of the integral numbers, then
$+1$ and $-1$ are the only unities.

2. Let $F$ be a field, then every element different from zero is a unity.

3. The unities of $F[x]$ are the polynomials of degree 0 (i.e. the elements
of $F$, with exception of 0).

Let $a$ and $b$ be two elements of $D$ for which $a = e\, b$, where $e$ is a unity of $D$;
then $a$ and $b$ are said to be associated. This association of elements
satisfies the laws of reflexivity, symmetry and transitivity; thus $D$ is
partitioned into classes of associated elements (associates).

Every element of $D$ is divisible by its associates and by the unities. A
non-zero and non-unity element which is not divisible by any other element
than its associates and unities, is said to be a prime-element. E.g. in the ring
of the integral numbers, an element is associated to itself and to its negative;
the primenumbers taken with positive or negative sign are the prime-elements.
In an arbitrary field the non-zero elements are all unities, so there are no
prime-elements. If the domain $D$ is suitably chosen, there might even be
elements which are neither zero nor unities and are not divisible by
any prime-element. E.g. consider the numbers

®o + a i 2 l -f- a 2 2 1 + a n 2 2 , (2) 

for $n = 0, 1, \ldots$ and $a_0, a_1, \ldots$ being integral numbers. The numbers (2)
form an integral domain $D$ with the unities 1 and $-1$, the prime-elements
being the odd primenumbers with positive or negative sign. The number
2 is not a unity and is not divisible by any prime-element of $D$.

If an element can be represented as a product of unities and
prime-elements it is said to be factorisable; the representation itself is called a
factorisation. Obviously a prime-element is always factorisable. If an
element $a$ is non-factorisable, it is neither a unity nor a prime, and it is
divisible by an element $b$ which is not associated to $a$ and is itself
non-factorisable. Two factorisations of an element will not be considered to be
different if there is a (1, 1)-correspondence between the prime-elements,
corresponding prime-elements being associated. If all the factorisations of
an element $a$ are equal in this sense, then the factorisation of $a$ is said to be
unique. If the factorisation of every element which is neither zero nor a
unity is unique, then one says that the factorisation in $D$ is unique. Thus the
factorisation in a field is unique.

Exercises. (1) Construct an integral domain $D$ which is not a field
but has no prime-element.
(2) Investigate the extension of the notions explained here to rings
which are not integral domains.

2-42. Domains with factorisable elements 

Theorem. Let $D$ be an integral domain and to every element $a \neq 0$
of $D$ let there correspond an integral positive number $N(a)$, the norm,
such that

$$N(ab) \geq N(a), \qquad (1)$$

where equality holds if and only if $b$ is a unity; then every element $\neq 0$
in $D$ is factorisable.

Proof. If $a$ is a product of $m$ non-unity elements,

$$a = a_1 \cdots a_m,$$

then

$$N(a) \geq N(a_1 \cdots a_{m-1}) + 1 \geq N(a_1 \cdots a_{m-2}) + 2 \geq \cdots > m \qquad (2)$$

Again, if $a$ is not factorisable, it is the product of two non-unities, say
$a = a_1 b_1$, of which at least one, say $b_1$, is not factorisable. Hence

$$a = a_1 b_1 = a_1 a_2 b_2 = \cdots = a_1 a_2 \cdots a_n b_n,$$

where $n$ can be chosen $> N(a)$ and none of the factors is a unity. But
from (2), $N(a) > n + 1$, which is a contradiction. Hence the theorem.


Examples. (1) Let $D$ be the domain of the integral numbers, and $N(a)$
be the absolute value $N(a) = |a|$. Then $N(a)$ satisfies (1). Hence the
integral numbers are factorisable.


(2) Let $D$ be the set of the numbers

$$\alpha + \beta \sqrt{-6}, \qquad (3)$$

where $\alpha$, $\beta$ take all the integral values; then $D$ is an integral domain.
Put

$$N(\alpha + \beta \sqrt{-6}) = \alpha^2 + 6 \beta^2 = (\alpha + \beta \sqrt{-6}) (\alpha - \beta \sqrt{-6}), \qquad (4)$$

then

$$N(ab) = N(a)\, N(b) \qquad (5)$$

and for $a \neq 0$, $N(a) > 0$ is an integral number.

If $N(\alpha + \beta \sqrt{-6}) = 1$, then $\alpha - \beta \sqrt{-6} = 1 / (\alpha + \beta \sqrt{-6})$;
hence $\alpha + \beta \sqrt{-6}$ is a unity. Therefore
$N(ab) = N(a)\, N(b) > N(a)$ if $b$ is not a unity. On the other hand, if $e$
is a unity,

$$1 = N(1) = N(e\, e^{-1}) = N(e)\, N(e^{-1}),$$

and therefore $N(e) = 1$, as 1 has no positive factor other than 1. Finally
$N(ae) = N(a)\, N(e) = N(a)$. Hence the norm (4) satisfies the condition (1).
In this integral domain the factorisation is not unique, since

$$6 = 2 \cdot 3 = -[\sqrt{-6}]^2$$

We have to prove that 2, 3 and $\sqrt{-6}$ are prime-elements in $D$. $N(2) = 4$,
$N(3) = 9$, $N(\sqrt{-6}) = 6$. If one of the 3 elements were not prime,
there would be elements of which the norm is equal to 2 or 3. If $\alpha$ is an
integral number,

$$\alpha^2 \equiv 0 \text{ or } 1 \pmod 3 \qquad (6)$$

according as $\alpha$ is divisible by 3 or not. Again let $N(\alpha + \beta \sqrt{-6}) = 2$;
then $\alpha^2 + 6 \beta^2 = 2$, $\alpha^2 \equiv 2 \pmod 3$, contrary to (6). Let $N(\alpha + \beta \sqrt{-6}) = 3$;
then $\alpha^2 + 6 \beta^2 = 3$, therefore $\alpha$ is divisible by 3, say $\alpha = 3k$;
multiplying with $2/3$ one gets $6 k^2 + 4 \beta^2 = 2$, hence $(2\beta)^2 \equiv 2 \pmod 3$, contrary
to (6). Thus 2, 3 and $\sqrt{-6}$ are prime-elements; so 6 can be factorised
in two different ways.
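
The absence of elements of norm 2 or 3 can also be confirmed by a finite search, since $\alpha^2 + 6 \beta^2 \leq 3$ leaves only very few values of $\alpha$ and $\beta$. The following is only a sketch, in which the number $\alpha + \beta \sqrt{-6}$ is stored as the pair $(\alpha, \beta)$; the names `norm` and `mul` are illustrative.

```python
def norm(alpha, beta):
    """Norm (4) of alpha + beta*sqrt(-6)."""
    return alpha * alpha + 6 * beta * beta

def mul(u, v):
    """Product of two numbers of the form (3), each stored as a pair (alpha, beta)."""
    (a, b), (c, d) = u, v
    return (a * c - 6 * b * d, a * d + b * c)

# Norm <= 3 forces |alpha| <= 1 and beta = 0, so this box is more than sufficient.
box = range(-3, 4)
norms = {norm(a, b) for a in box for b in box}
assert 2 not in norms and 3 not in norms

# The two factorisations of 6.
assert mul((2, 0), (3, 0)) == (6, 0)
root = (0, 1)                       # sqrt(-6)
square = mul(root, root)            # equals -6
assert (-square[0], -square[1]) == (6, 0)

# The norm is multiplicative, as stated in (5).
u, v = (1, 2), (3, -1)
assert norm(*mul(u, v)) == norm(*u) * norm(*v)
```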

2-43 Unique factorisation. The following criterion is often used.

Criterion for uniqueness of factorisation. Let $D$ be an integral domain
in which every element $\neq 0$ is factorisable. The necessary and sufficient
condition for the uniqueness of the factorisation is that no product $ab$ can
be divisible by a prime-element $p$, unless a factor $a$ or $b$ is divisible by $p$.

Proof. (1) Let the factorisation be unique in $D$. One gets the
factorisation of $c = ab$ by putting together the factorisation of $a$ and that of $b$.
On the other hand $c = dp$, and one gets the factorisation of $c$ by putting
together the factorisation of $d$ and the prime-element $p$. From the
uniqueness of the factorisation it follows that an associate of $p$ occurs among the
prime-factors of $a$ or of $b$. Hence $a$ or $b$ is divisible by $p$.

(2) Let the above condition for the products hold; then it can be
proved by mathematical induction that if $a_1 a_2 \cdots a_n$ is divisible by $p$, at
least one of the factors is divisible by $p$. Of course the proposition holds for
$n = 1$; suppose it to be true for $n < m$. If $a_1 (a_2 \cdots a_m)$ is divisible by $p$,
and $a_1$ is not divisible by $p$, the product in brackets is divisible by $p$, and
therefore one of the factors is divisible by $p$. Let now $c$ be factorised,
$c = p_1 \cdots p_m$, and $c$ be divisible by a prime-element $p$; then at least one
of the prime-elements $p_i$ is divisible by $p$ and therefore associated with $p$.
Suppose now, there exist in $D$ elements which are not uniquely factorisable,
and let $r$ be the minimum number of prime-factors (which are not necessarily
all different) such that $b = q_1 \cdots q_r$ admits a different factorisation
$b = p_1 \cdots p_s$ (where $s \geq r$). Obviously $r > 1$. As $b$ is divisible by $p_s$,
one of the factors $q_j$, say $q_r$, is associated with $p_s$: $p_s = u\, q_r$, where $u$ is a unity. Then
$q_1 \cdots q_{r-1} = u\, p_1 \cdots p_{s-1}$ are different factorisations, contrary to the
supposition that products of less than $r$ prime-elements are uniquely factorisable.
Hence the theorem.

If to two elements $a$ and $b$ of $D$ there exists in $D$ a common factor

$$(a, b) \qquad (1)$$

such that every common factor of $a$ and $b$ is a factor of $(a, b)$, then $(a, b)$
is called a highest common factor, or h.c.f., of $a$ and $b$. Every element of
$D$ associated to $(a, b)$ is also an h.c.f. of $a$ and $b$. Conversely two h.c.f.
of $a$ and $b$ must, by definition, be divisible one by the other and are
therefore associated. Hence (1) is not determined uniquely, but up to
a unity-factor only, provided that an h.c.f. exists at all.

Suppose that to every pair of elements of $D$ there exists an h.c.f.;
then every common factor of 3 elements, say $a$, $b$, $c$, is a common factor
of $(a, b)$ and $c$, and therefore a factor of $d = ((a, b), c)$. On the other
hand, $d$ is a common factor of $a$, $b$ and $c$. Thus there exists an h.c.f. to every
triplet of elements of $D$. Obviously this h.c.f. is uniquely determined up
to a unity-factor. By repeating the procedure, the following result is easily
obtained.

Theorem 1. If to every pair of elements of $D$ an h.c.f. exists, then there
exists to every $n$-tuplet $a_1, \ldots, a_n$ of elements of $D$ an h.c.f.

$$(a_1, \ldots, a_n) = ((a_1, \ldots, a_{n-1}), a_n) \qquad (2)$$

which is uniquely determined up to a unity-factor. An element of $D$ is a
common factor of the elements of the $n$-tuplet if and only if it is a factor
of (2).

The operation for h.c.f. is commutative and associative; moreover
it satisfies a distributive law:

$$a\, (b, c) = (a b, a c) \qquad (3)$$
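
For the integral numbers, formula (2) and the distributive law (3) can be checked directly. The following is only a sketch: it relies on the function `gcd` of the Python standard library, which always returns the non-negative one of the two associated h.c.f., and the name `hcf` is illustrative.

```python
from functools import reduce
from math import gcd

def hcf(*elements):
    """H.c.f. of an n-tuplet of integers, built up pair by pair as in (2)."""
    return reduce(gcd, elements)

# (a_1, a_2, a_3) = ((a_1, a_2), a_3)
assert hcf(84, 126, 210) == gcd(gcd(84, 126), 210) == 42

# Distributive law (3): a (b, c) = (a b, a c)
a, b, c = 6, 20, 28
assert a * hcf(b, c) == hcf(a * b, a * c) == 24
```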

Let $D$ be uniquely factorisable, and

$$a = p_1^{r_1} p_2^{r_2} \cdots p_m^{r_m}, \qquad (4)$$

where $p_1, p_2, \ldots$ are different prime-elements. If $0 \leq s_j \leq r_j$ for
$j = 1, \ldots, m$, then for every unity $e$,

$$e\, p_1^{s_1} \cdots p_m^{s_m} \qquad (5)$$

is a factor of $a$. On the other hand, it follows from the above criterion that
$a$ and therefore any factor of $a$ can be divisible by such prime-elements only,
as are associated to $p_1, \ldots, p_m$. As $p_2^{r_2} \cdots p_m^{r_m}$ is not divisible by $p_1$, the
element $a$ and any factor of $a$ cannot be divisible by $p_1^r$ with $r > r_1$. Similarly
for the other prime-elements. Hence the elements (5) are the only factors
of (4). Given two elements $a$ and $b$ of $D$. Using the notation $p_i^0 = 1$, one
can express both elements simultaneously by

$$a = p_1^{r_1} \cdots p_m^{r_m}, \qquad b = u\, p_1^{t_1} \cdots p_m^{t_m}, \quad \text{where } u \text{ is a unity} \qquad (6)$$

An element (5) is therefore a common factor of $a$ and $b$, if $0 \leq s_j \leq r_j$ and
$s_j \leq t_j$ hold for $j = 1, \ldots, m$. Let $w_j$ be the smaller of the two numbers
$r_j$ and $t_j$; then

$$(a, b) = p_1^{w_1} \cdots p_m^{w_m} \qquad (7)$$

is an h.c.f. of $a$ and $b$. Hence

Theorem 2. If in $D$ the factorisation is unique, there exists a highest
common factor for every pair of elements $\neq 0$.
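
Formula (7) can be illustrated in the domain of the integral numbers. The following is only a sketch for positive integers, using trial division to obtain the exponents $r_j$ and $t_j$; the names `factorise` and `hcf_by_exponents` are illustrative.

```python
from collections import Counter
from math import gcd

def factorise(n):
    """Prime factorisation of a positive integer as {prime: exponent}."""
    factors = Counter()
    p = 2
    while p * p <= n:
        while n % p == 0:
            factors[p] += 1
            n //= p
        p += 1
    if n > 1:
        factors[n] += 1
    return factors

def hcf_by_exponents(a, b):
    """H.c.f. according to (7): each prime taken with the smaller exponent w_j."""
    fa, fb = factorise(a), factorise(b)
    result = 1
    for p in fa.keys() & fb.keys():     # primes missing from one side have w_j = 0
        result *= p ** min(fa[p], fb[p])
    return result

# 360 = 2^3 * 3^2 * 5 and 84 = 2^2 * 3 * 7, hence (360, 84) = 2^2 * 3 = 12
assert hcf_by_exponents(360, 84) == gcd(360, 84) == 12
```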

These considerations on unique factorisation tally nearly verbatim with
the corresponding theory of elementary arithmetic. The only essential
difference consists of the unities which occur as factors and which remain
arbitrary to a certain extent. In the domain of the integral numbers, there
exist two unities $+1$ and $-1$, and therefore the prime-elements occur in
pairs of associated ones; but in arithmetic one usually gives preference to
the positive numbers, thus the factorisation becomes unique in a stricter
sense. In the general case considered here, there is no way of making a
similar distinction.

2-44 Euclidean domains. Let $c_1$, $c_2$ run over the elements of $D$, and
$a$, $b$ be two particular elements; then the elements

$$c_1 a + c_2 b \qquad (1)$$

form a commutative ring which with every element contains also its
multiples and especially its associates. Every element (1) is divisible by every
common factor of $a$ and $b$. If the h.c.f. $(a, b)$ is an element of the form
(1), then the set of the elements (1) consists exactly of those elements which
are divisible by $(a, b)$. This case is of a special interest.

Theorem 1. If the elements $\neq 0$ of $D$ are factorisable, and for every
pair $a$, $b$ of elements of $D$ an h.c.f. exists and can be expressed by (1),
then $D$ is uniquely factorisable.

Proof. Let $ab$ be divisible by a prime-element $p$, say $ab = kp$, and $a$
not be divisible by $p$; then the highest common factors of $a$ and $p$ are unities;
hence a unity, say $e^{-1}$, can be expressed by $c\,a + d\,p = e^{-1}$. Hence
$(c e)(a b) + (d e b)\, p = b = (c e k + d e b)\, p$. Therefore $b$ is divisible by $p$.
Hence the theorem follows from the criterion of 2-43.


Let

$$a_1, a_2, \ldots, a_n, a_{n+1} \qquad (2)$$

be elements of the integral domain $D$ satisfying the following conditions:

$$a_1 + b_2 a_2 = a_3$$
$$a_2 + b_3 a_3 = a_4$$
$$\cdots$$
$$a_{k-1} + b_k a_k = a_{k+1} \qquad (3)$$
$$\cdots$$
$$a_{n-1} + b_n a_n = a_{n+1}$$

Then every common factor of $a_{k-1}$ and $a_k$ is also a common factor of $a_k$ and
$a_{k+1}$, and conversely, and since this property holds for every $k$, each pair
of consecutive elements has the same common factors. If especially $a_{n+1} = 0$,
then $a_n$ is the h.c.f. of $a_n$ and $a_{n-1}$; therefore every pair of consecutive
elements of the sequence (2) has an h.c.f., and this factor is equal to $a_n$. Hence

$$a_n = (a_1, a_2) \qquad (4)$$

The construction of a sequence (2) ending with $a_{n+1} = 0$ is called the
"Algorithmus of the h.c.f." This algorithmus allows one to express
successively $a_3, a_4, \ldots, a_n = (a_1, a_2)$ as linear homogeneous functions of $a_1$
and $a_2$. If therefore the elements $\neq 0$ of $D$ are factorisable, and the
algorithmus can be performed for every pair of elements, then $D$ is uniquely
factorisable. Thus it is very important to have a condition which is
sufficient for the working of the algorithmus. Let in an integral domain $D$
a norm-function $N$ be determined which satisfies the conditions explained
in 2-42. Let furthermore to every pair of elements $a_{k-1} \neq 0$ and $a_k \neq 0$
of $D$ exist elements $a_{k+1}$ and $b_k$ of $D$ such that

$$a_{k-1} + b_k a_k = a_{k+1} \qquad (5)$$

where either $a_{k+1} = 0$, or $N(a_{k+1}) < N(a_k)$;
then $D$ is said to be a Euclidean domain.

Theorem 2. To every $m$-tuplet of elements of a Euclidean domain $D$,
there exists in $D$ an h.c.f. which can be found out by a repeated
application of the algorithmus; the h.c.f. can be represented as a linear
homogeneous function of the $m$ elements with coefficients out of $D$. The
factorisation in $D$ is unique.

Proof. Since a norm-function exists in $D$, the elements are factorisable.
From (5) it follows that, given any two elements $a_1$ and $a_2$, one can
find a sequence $a_3, a_4, \ldots$ in $D$ such that $N(a_2) > N(a_3) > \cdots$. As
the norms are positive integers, their sequence cannot have more than
$N(a_2)$ elements. If $N(a_n)$ is the last norm in the sequence, it follows
from (5) that $a_{n+1} = 0$. Hence $a_n = (a_1, a_2)$ can be found out by the
algorithmus, and therefore $a_n = b\, a_1 + c\, a_2$, where $b$ and $c$ are elements
of $D$. The theorem holds therefore for $m = 2$, and the factorisation in $D$
is unique. From 2-43, theorem 1 it follows that the h.c.f. of every
$m$-tuplet exists. Suppose that

$$(d_1, \ldots, d_k) = b_1 d_1 + \cdots + b_k d_k,$$

and that this h.c.f. can be found out by applying the algorithmus $k - 1$
times; then

$$(d_1, \ldots, d_{k+1}) = ((d_1, \ldots, d_k), d_{k+1}) = d\, (b_1 d_1 + \cdots + b_k d_k) + c_{k+1} d_{k+1}$$
$$= c_1 d_1 + \cdots + c_{k+1} d_{k+1}$$

can be found out by applying the algorithmus $k$ times. Hence the theorem.
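
The algorithmus of the h.c.f. for two elements can be sketched for the domain of the integral numbers as follows. This is only an illustration under stated assumptions: the function name `algorithmus` is invented here, each $a_k$ is carried along together with its representation as a linear homogeneous function of $a_1$ and $a_2$, and $b_k$ is chosen as the negative of the quotient given by Python's floor division, which is one of several admissible choices.

```python
def algorithmus(a1, a2):
    """Return (h, b, c) with h = b*a1 + c*a2, where h is an h.c.f. of a1 and a2.

    Each entry is a triple (a_k, u, v) with a_k = u*a1 + v*a2.
    """
    prev, cur = (a1, 1, 0), (a2, 0, 1)
    while cur[0] != 0:
        q = prev[0] // cur[0]                  # b_k = -q in the notation of (5)
        nxt = (prev[0] - q * cur[0],           # a_{k+1} = a_{k-1} + b_k a_k
               prev[1] - q * cur[1],
               prev[2] - q * cur[2])
        # Python's floor division guarantees |a_{k+1}| < |a_k|, so the loop ends.
        prev, cur = cur, nxt
    return prev

h, b, c = algorithmus(240, 46)
assert h == 2
assert b * 240 + c * 46 == h
```

For negative arguments the value returned may be the negative one of the two associated h.c.f.; in accordance with 2-43 the h.c.f. is determined up to a unity-factor only.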

2-45 The domain of the integral numbers. Consider the domain $J$
of the integral numbers, and put

$$N(a) = |a|$$

This function has obviously the properties postulated for a norm-function
in 2-42, (1), and it satisfies also the condition 2-44, (5). Hence the
factorisation in $J$ is unique, and the h.c.f. can be found by the algorithmus. The
multiplicative properties of integral numbers are therefore determined by
their representation as products of powers of primenumbers. Thus it
is important to know that there exists an infinite number of
primenumbers.

Proof (Euclid). Suppose there exists a finite number of primenumbers
only, say the numbers $p_j$ ($j = 1, \ldots, n$); then $(p_1 p_2 \cdots p_n) + 1$ is neither
divisible by any primenumber, nor is it a unity. Hence the supposition
is wrong, and the above proposition holds.

2-46 Homomorphism modulo a prime-element. Suppose $D$ to be an
integral domain where factorisation is unique, and let $p$ be any particular
prime-element in $D$. The elements which are divisible by $p$ form a subring
$D_p$ of $D$. As has been shown in 2-23, the classes of residues of $D_p$ in $D$ form
a ring $D^p$ which is homomorphic to $D$. By this homomorphism an element $a$
of $D$ is mapped on an element $(a)$ of $D^p$. Hence $(a) = (a + k p)$ for
every element $k$ of $D$. The zero-element $(0)$ of $D^p$ consists of the elements
divisible by $p$, i.e. the elements of $D_p$. Since $D$ is commutative, and $D^p$ is
homomorphic to $D$, the ring $D^p$ is commutative. It will be shown now
that $D^p$ is an integral domain. Suppose $(a)(b) = (0)$; then $(a b) = (0)$,
that means that $a b$ is divisible by $p$. Since the factorisation in $D$ is supposed
to be unique, and $p$ is a prime-element, at least one of the factors $a$, $b$ must
be divisible by $p$, i.e. one of the elements $(a)$, $(b)$ of $D^p$ must be the
zero-element. Furthermore $(1)$ is the unitelement of $D^p$. Hence the
commutative ring $D^p$ is an integral domain.
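
For the integral numbers the statement can be checked by listing the products of the classes of residues. The following is only a sketch; the name `zero_divisors` is illustrative. It contrasts the prime-element 7 with the element 6, which is not a prime-element.

```python
def zero_divisors(n):
    """Pairs (a, b) of non-zero residues modulo n, a <= b, with (a)(b) = (0)."""
    return [(a, b) for a in range(1, n) for b in range(a, n) if (a * b) % n == 0]

# Modulo the prime-element 7 no product of non-zero classes is the zero-element.
assert zero_divisors(7) == []

# Modulo 6 = 2 * 3 the classes of residues do not form an integral domain.
assert zero_divisors(6) == [(2, 3), (3, 4)]
```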

2-47 Factorisation in $F[x]$. Consider now the domain $F[x]$ of the
polynomials in $x$ over a field $F$ (see 2-34); with exception of the element
0, for every element $f(x)$ of $F[x]$ a positive integral norm is defined by

$$N(f(x)) = \text{degree}\,(f(x)) + 1 \qquad (1)$$

The elements $\neq 0$ of $F$ are the unities. Hence $N(a) = 1$ holds for unities
only. Applying 2-34, theorem 3, one gets for non-zero-elements $a$, $b$ of $F[x]$

$$N(a b) = N(a) + N(b) - 1 \qquad (2)$$

Hence $N(a b) \geq N(a)$, where equality holds if and only if $b$ is a unity.
Let

$$f_1(x) = \sum_{\nu=0}^{n} a_\nu x^\nu, \qquad f_2(x) = \sum_{\nu=0}^{m} b_\nu x^\nu, \qquad n \geq m, \quad b_m \neq 0$$

be two polynomials of $F[x]$. The condition $n \geq m$ does not involve any
loss of generality, as e.g. $a_n$ may be equal to zero; the condition $b_m \neq 0$
only forbids $f_2(x)$ to be the zero-element. Put

$$f_1(x) - (a_n / b_m)\, x^{n-m} f_2(x) = \sum_{\nu=0}^{n-1} a^{(1)}_\nu x^\nu = \varphi_1(x)$$

$$\varphi_k(x) - (a^{(k)}_{n-k} / b_m)\, x^{n-m-k} f_2(x) = \sum_{\nu=0}^{n-k-1} a^{(k+1)}_\nu x^\nu = \varphi_{k+1}(x)$$

$$\varphi_{n-m}(x) - (a^{(n-m)}_{m} / b_m)\, f_2(x) = \sum_{\nu=0}^{m-1} a^{(n-m+1)}_\nu x^\nu = f_3(x)$$

then

$$f_1(x) + b_2(x) f_2(x) = f_3(x), \qquad (3)$$

where the degree of $f_3(x)$ is less than the degree of $f_2(x)$.

Hence

$$\text{either} \quad N(f_3(x)) < N(f_2(x)) \qquad (4)$$

or $f_3(x) = 0$. Hence

Theorem 1. Let $f_1(x)$ and $f_2(x) \neq 0$ be polynomials of $F[x]$ (where
$F$ denotes a field), then there exist in $F[x]$ polynomials $b_2(x)$ and $f_3(x)$
satisfying (3) and (4), and these polynomials can be calculated by a
finite number of steps.

The method of calculating $b_2(x)$ and $f_3(x)$ is called the algorithmus
of division, and $f_3(x)$ is the remainder. By theorem 1 it is established that
the norm-function satisfies the conditions of 2-44. Hence

Theorem 2. $F[x]$ is a Euclidean domain; the h.c.f. of any $m$
elements $\varphi_1, \ldots, \varphi_m$ can be represented as a linear homogeneous function
$g_1(x) \varphi_1(x) + \cdots + g_m(x) \varphi_m(x)$, and the factorisation is unique.

It may be mentioned that $f_3(x)$ is uniquely determined by (3) and (4),
since if $f_1(x) + c(x) f_2(x) = \psi(x)$, then $f_3(x) - \psi(x)$ is divisible by $f_2(x)$,
and $\psi(x)$ therefore cannot satisfy the conditions stated for $f_3(x)$ in (4),
unless $\psi(x) = f_3(x)$ holds. A similar uniqueness does not hold in the
domain $J$ of the integral numbers. Indeed

$$12 - 7 = 5, \qquad N(5) = 5 < N(7) = 7$$

$$12 - 2 \cdot 7 = -2, \qquad N(-2) = 2 < N(7) = 7$$
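
The algorithmus of division of theorem 1 can be sketched as follows. This is only an illustration, assuming that $F$ is the field of the rational numbers (represented by `Fraction` from the Python standard library) and that a polynomial is stored as the list $[a_0, a_1, \ldots]$; the name `divide` is invented here. The function returns a quotient $q(x)$ and the remainder, so that in the notation of (3) one has $b_2(x) = -q(x)$.

```python
from fractions import Fraction

def divide(f1, f2):
    """Return (q, f3) with f1 = q*f2 + f3 and degree(f3) < degree(f2).

    Coefficient lists, lowest degree first; f2[-1] is b_m and must be non-zero.
    """
    r = [Fraction(c) for c in f1]
    m = len(f2) - 1
    lead = Fraction(f2[m])
    q = [Fraction(0)] * max(len(f1) - m, 1)
    # Each pass removes the highest remaining term, as in the steps phi_k -> phi_{k+1}.
    for k in range(len(r) - 1, m - 1, -1):
        factor = r[k] / lead
        q[k - m] = factor
        for i in range(m + 1):
            r[k - m + i] -= factor * f2[i]
    return q, (r[:m] if m > 0 else [Fraction(0)])

# (x^2 - 1) = (x - 1)(x + 1) + 0
q, f3 = divide([-1, 0, 1], [1, 1])
assert q == [-1, 1] and f3 == [0]

# (x^3 - 2x + 5) divided by (2x + 1): the remainder is the value at x = -1/2
q, f3 = divide([5, -2, 0, 1], [1, 2])
assert f3 == [Fraction(47, 8)]
```

The division by $b_m$ is the only step which requires $F$ to be a field.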

The prime-elements of $F[x]$ are said to be irreducible polynomials. Since
polynomials of degree zero are unities, a polynomial of degree
1 cannot be represented as a product of two polynomials of degree $\geq 1$.
Hence every linear polynomial in $x$ over $F$ is irreducible. A polynomial
of degree $> 1$ which is not irreducible is said to be reducible. If $F$ is a
subfield of $F_1$, every polynomial $f(x)$ of $F[x]$ belongs also to $F_1[x]$;
$f(x)$ may be irreducible in $F[x]$ and reducible in $F_1[x]$. E.g. the
polynomial $f(x) = x^2 - 2$ is a polynomial of $F[x]$, where $F$ is the field of the
rational numbers; in $F[x]$, the polynomial $f(x)$ is irreducible. Let $F_1$ be
the field of the real numbers, then $f(x) = (x - \sqrt{2})(x + \sqrt{2})$ is reducible
in $F_1[x]$.

Theorem 3. Let $f_1(x)$ and $f_2(x)$ be two polynomials of $F[x]$, and
let there be no common irreducible factor of $f_1(x)$ and $f_2(x)$; then there
exist polynomials $\psi_1(x)$ and $\psi_2(x)$ of $F[x]$ such that

$$\psi_1(x) f_1(x) + \psi_2(x) f_2(x) = 1 \qquad (5)$$

holds.

Proof. Every highest common factor of $f_1(x)$ and $f_2(x)$ is a unity
of $F[x]$, i.e. any element $a \neq 0$ of $F$. From theorem 1 it follows that there
exist polynomials $\varphi_1(x)$ and $\varphi_2(x)$ satisfying $\varphi_1(x) f_1(x) + \varphi_2(x) f_2(x) = a$.
Putting $\psi_i(x) = a^{-1} \varphi_i(x)$ (for $i = 1, 2$), one gets (5).

If especially $f_2(x)$ is irreducible, and $f_1(x)$ is not divisible by $f_2(x)$, then
$\psi_1(x) f_1(x) \equiv 1 \pmod{f_2(x)}$, i.e.

$$\psi_1(x) \equiv \{f_1(x)\}^{-1} \pmod{f_2(x)} \qquad (6)$$

In 2-46 it has been shown that in an integral domain with unique
factorisation, the classes of residues of a prime-element form an integral domain.
As $F[x]$ is an integral domain with unique factorisation, the classes of
residues of an irreducible $f(x)$ form an integral domain, and as (6) holds, these
classes form a field. Hence

Theorem 4. Let $f(x)$ be irreducible in $F[x]$, then the classes of
residues of $f(x)$ in $F[x]$ form a field.

If $f(x)$ is reducible, say $f(x) = \varphi(x) \psi(x)$, then the classes of residues
do not form an integral domain since $f(x)$ is not a prime-element. Of course
between the classes, the equation $(\varphi(x))(\psi(x)) = (f(x)) = (0)$ holds,
though $(\varphi(x)) \neq (0)$ and $(\psi(x)) \neq (0)$.

The elements of the field considered in the last theorem are classes of
residues of $f(x)$. Let $n$ be the degree of $f(x)$ and let $f_1(x)$ be any polynomial
of $F[x]$, then one gets by the algorithmus of division

$$f_1(x) - b(x) f(x) = f_3(x),$$

where $f_3(x)$ is the remainder of the division and therefore of a degree
$< n$. Since $f_1(x) - f_3(x)$ is divisible by $f(x)$, $f_3(x)$ belongs to the same
class of residues as $f_1(x)$; thus every class contains a polynomial of degree
$< n$. Let there be two such polynomials, say $f_3(x)$ and $f_4(x)$, in the same
class; then $f_3(x) - f_4(x)$ is of degree $< n$ and divisible by $f(x)$, hence
it is zero. Thus in every class there exists one and only one polynomial of
degree $< n$ which characterises the class. Consider in particular the classes
containing polynomials $a_0$ of degree $< 1$. These classes form a subfield
which is in a (1, 1) and isomorphic correspondence with the elements $a_0$
of $F$. This subfield is therefore isomorphic to $F$. Hence

Theorem 5. Each of the classes of residues as considered in theorem 4
contains exactly one polynomial of a degree which is less than the degree
of $f(x)$, and which characterises the class. The classes characterised by
polynomials of degree $< 1$ (i.e. elements of $F$) form a subfield which is isomorphic
to the field $F$, every class being represented by its characterising element.
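
Theorems 4 and 5 may be illustrated by the irreducible polynomial $f(x) = x^2 - 2$ over the field of the rational numbers. The following is only a sketch: every class of residues is represented by its characterising polynomial $a + b x$ of degree $< 2$, stored as the pair $(a, b)$, and the names `mul` and `inverse` are illustrative. The closed formula used for the inverse is special to this example and is not the general construction by formula (6).

```python
from fractions import Fraction

def mul(u, v):
    """Product of the classes of a + b*x and c + d*x modulo x^2 - 2."""
    (a, b), (c, d) = u, v
    # (a + b x)(c + d x) = ac + (ad + bc) x + bd x^2, and x^2 = 2 modulo f(x)
    return (a * c + 2 * b * d, a * d + b * c)

def inverse(u):
    """Inverse of a non-zero class a + b*x."""
    a, b = u
    n = a * a - 2 * b * b   # non-zero for (a, b) != (0, 0), since x^2 - 2 is irreducible
    return (a / n, -b / n)

u = (Fraction(3), Fraction(1))            # the class of 3 + x
assert inverse(u) == (Fraction(3, 7), Fraction(-1, 7))
assert mul(u, inverse(u)) == (1, 0)       # the unitelement

# The classes (a, 0) behave exactly like the elements a of F.
assert mul((Fraction(2), 0), (Fraction(5), 0)) == (10, 0)
```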

2-48 Factorisation in $D[x]$. Let $D$ be an integral domain with
unique factorisation; consider the factorisation in $D[x]$. At first the
divisibility of polynomials $f(x)$ of $D[x]$ by elements of $D$ will be investigated.
In the special case where $D$ is a field, every $f(x)$ is divisible by every element
of $D$ which is different from zero, as these elements are unities. This special
case has been investigated already in 2-47.

If $f(x)$ is divisible by an element $c$ of $D$, then

$$f(x) = c \sum_\nu b_\nu x^\nu = \sum_\nu c\, b_\nu x^\nu, \qquad (1)$$

hence every coefficient of $f(x)$ is divisible by $c$. If the h.c.f. of the
coefficients of $f(x)$ is equal to 1, then $f(x)$ is divisible by unities only, and
conversely. In this case $f(x)$ is said to be a primitive polynomial of $D[x]$.

Let $p$ be any prime-element of $D$; as in 2-46, the subring of the
elements which are divisible by $p$ will be denoted by $D_p$.

Then $D_p[x]$ consists of the polynomials the coefficients of which are
divisible by $p$, whereas $\{D[x]\}_p$ consists of the polynomials of $D[x]$ which
are divisible by $p$. From (1) it follows that

$$D_p[x] = \{D[x]\}_p \qquad (2)$$

As the factorisation in $D$ is unique, the classes of residues of $D_p$ in $D$ form
an integral domain $D^p$ of which $D_p$ is the zero-element (see 2-46); from
2-34, theorem 2 it follows that $D^p[x]$ also is an integral domain of which
the class $D_p[x]$ is the zero-element. The classes of residues of $\{D[x]\}_p$
in $D[x]$ form a ring $\{D[x]\}^p$ which is isomorphic to $D^p[x]$ and is
therefore an integral domain. Let $\varphi(x) = f_1(x) f_2(x)$ be divisible by $p$;
then $\varphi(x)$ belongs to the zero-element $\{D[x]\}_p$ of $\{D[x]\}^p$ and therefore
at least one of the two factors is divisible by $p$. Hence

Lemma. If the factorisation in $D$ is unique and $p$ is a prime-element of
$D$, furthermore $f_1(x)$ and $f_2(x)$ belong to $D[x]$ and $f_1(x) f_2(x)$ is divisible by
$p$, then at least one of the factors $f_1(x)$ and $f_2(x)$ is divisible by $p$.

Corollary 1. If $f_1(x)$ and $f_2(x)$ are primitive polynomials, then
$f_1(x) f_2(x)$ is primitive.
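
Corollary 1 can be tried out on an example in $J[x]$, where $J$ is the domain of the integral numbers. The following is only a sketch, with a polynomial stored as the list $[a_0, a_1, \ldots]$; the names `content` and `poly_mul` are illustrative, `content` denoting the non-negative h.c.f. of the coefficients.

```python
from functools import reduce
from math import gcd

def content(f):
    """Non-negative h.c.f. of the coefficients of f; f is primitive if it equals 1."""
    return reduce(gcd, f)

def poly_mul(f, g):
    """Coefficient list of f * g (both lists non-empty)."""
    out = [0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            out[i + j] += a * b
    return out

f1 = [6, 10, 15]    # 6 + 10x + 15x^2: no single prime divides all coefficients
f2 = [4, 9]         # 4 + 9x
assert content(f1) == 1 and content(f2) == 1

product = poly_mul(f1, f2)
assert product == [24, 94, 150, 135]
assert content(product) == 1        # the product is again primitive
```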

Corollary 2. If $a\, \varphi(x)$ is divisible by a primitive polynomial $f(x)$, then
$\varphi(x)$ is divisible by $f(x)$.

Proof. Since $a$ belongs to $D$, it is factorisable: $a = p_1 p_2 \cdots p_n$. Let

$$a\, \varphi(x) = f(x) f_1(x)$$

Since $f(x)$ is primitive, it is not divisible by $p_1$; hence from the lemma it
follows that $f_1(x) = p_1 f_2(x)$, and therefore

$$p_2 \cdots p_n\, \varphi(x) = f(x) f_2(x)$$

By $n$-fold repetition of the procedure, one gets

$$\varphi(x) = f(x) f_{n+1}(x)$$

Corollary 3. If $f(x)$ is factorisable in $D[x]$, then $f(x) = a_1 \cdots a_m
f_1(x) \cdots f_n(x)$, where $f_1(x), \ldots, f_n(x)$ are primitive, and $a_1 \cdots a_m = a$ is the
factorisation in $D$ of the h.c.f. of the coefficients of $f(x)$.

Proof. A prime-element of $D[x]$ is either an element of $D$, or it is
a primitive polynomial, since if the coefficients have a common non-unity
factor, a polynomial is not a prime-element. Let $f_1(x), \ldots, f_n(x)$ be the
primitive polynomials of degree $> 0$ among the prime-factors of $f(x)$;
then the product of them is a primitive polynomial, say $\varphi(x)$. The product
of the prime-factors of $f(x)$ belonging to $D$ is an element $a$ of $D$. Thus
$f(x) = a\, \varphi(x)$. Hence $a$ is the h.c.f. of the coefficients of $f(x)$. The factors
$a_i$ of $a$ are prime in $D[x]$, hence they are also prime-elements of $D$, and
$a = a_1 \cdots a_m$ is a factorisation of $a$ in $D$.

Theorem. If the factorisation in $D$ is unique, so is the factorisation in
$D[x]$.

Proof. Since $D$ is an integral domain, there exists a quotient-field $F$,
of which $D$ is a subring; every element of $F$ is the quotient of two elements
of $D$. Let $f(x)$ be a primitive polynomial of $D[x]$; then $f(x)$ is also a
polynomial of $F[x]$, and it is uniquely factorisable in $F[x]$:

$$f(x) = \varphi_1(x) \cdots \varphi_n(x)$$

The factors $\varphi_i(x)$ are determined up to a unity of $F[x]$, i.e. up to an element
of $F$. Since the coefficients of $\varphi_i(x)$ are quotients of elements of $D$, there
exists an element $b_i \neq 0$ such that $b_i \varphi_i(x)$ is a polynomial in $D[x]$, hence

$$b_i \varphi_i(x) = c_i f_i(x),$$

where $f_i(x)$ is primitive. $f_i(x)$ and $\varphi_i(x)$ are associated in $F[x]$, hence
$f_i(x)$ is a prime-element of $F[x]$; moreover $f_i(x)$ is a prime-element in $D[x]$,
otherwise it must be a product of primitive polynomials in $D[x]$, contrary
to the fact that it is a prime-element of $F[x]$. Putting $b_1 \cdots b_n = b$,
$c_1 \cdots c_n = c$, one gets

$$b\, f(x) = c\, f_1(x) \cdots f_n(x)$$

From the corollaries 1 and 2 it follows that the primitive polynomials $f(x)$
and $f_1(x) \cdots f_n(x)$ are divisible one by the other and are therefore
associated. Hence

$$f(x) = e\, f_1(x) \cdots f_n(x),$$

where $e$ is a unity, and $f_1(x), \ldots, f_n(x)$ are prime-elements of $F[x]$ as well
as of $D[x]$. In the special case where $f(x)$ is a prime-element of $D[x]$, the
number $n$ is equal to 1. Every prime-element of $D[x]$ which does not belong
to $D$ is therefore also a prime-element of $F[x]$. Since the factorisation
of $f(x)$ in $F[x]$ is unique, the primitive polynomial $f(x)$ is factorisable in
$D[x]$ into prime-elements $f_i(x)$ of $D[x]$ which are determined up to unities of
$F[x]$, i.e. up to elements of $F$. But since each $f_i(x)$ is primitive, these
elements must be unities of $D$. Hence $a f(x)$, where $a$ is an element $\neq 0$ of $D$,
is factorisable into the prime-factors of $a$ and the prime-factors $f_i(x)$
($i = 1, \ldots, n$) of $f(x)$, these factorisations being unique. On the other hand
it follows from Corollary 3 that
every factorisation of $a f(x)$ consists of a factorisation of $a$ and a
factorisation of $f(x)$. Hence the theorem.

This theorem may be considered as a generalisation of 2-47, theorem 1, but it does not generalise its full statement. Of course $D[x]$ is not necessarily a Euclidean domain, even if $D$ is. Consider e.g. the domain $J$ of the integral numbers, which has been proved to be a Euclidean domain. $a(x) = x - 2$ and $b(x) = x + 2$ are primitive polynomials of $J[x]$ and are of degree 1; hence they are prime-elements. Since $\pm 1$ are the only unities, they are non-associated; hence $(a(x), b(x)) = 1$. If this h.c.f. were linearly dependent on $a(x)$ and $b(x)$, say

$$u(x)\,a(x) + v(x)\,b(x) = 1,$$

then this equation must hold when $x$ is replaced by any element of $J$; for $x = 0$, one gets $-2u(0) + 2v(0)$, which is obviously even and therefore $\neq 1$. Hence $J[x]$ is non-Euclidean.
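
The parity argument can be tried out on a small sample. The following Python sketch is only an illustration, not a proof: the names `mul`, `combination` and `constant_terms` are hypothetical helpers introduced here, polynomials are stored as integer coefficient lists with the constant term first, and only multipliers $u(x)$, $v(x)$ of bounded degree and bounded coefficients are tried.

```python
from itertools import product

A = [-2, 1]  # a(x) = x - 2, constant term first
B = [2, 1]   # b(x) = x + 2


def mul(p, q):
    """Product of two non-empty integer coefficient lists."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out


def combination(u, v):
    """Coefficients of u(x) a(x) + v(x) b(x)."""
    left, right = mul(u, A), mul(v, B)
    n = max(len(left), len(right))
    left += [0] * (n - len(left))
    right += [0] * (n - len(right))
    return [l + r for l, r in zip(left, right)]


def constant_terms(bound, degree):
    """Constant terms of all combinations with |coefficients| <= bound."""
    coeffs = range(-bound, bound + 1)
    return {
        combination(list(u), list(v))[0]
        for u in product(coeffs, repeat=degree + 1)
        for v in product(coeffs, repeat=degree + 1)
    }
```

Every constant term in such a sample is even, so the constant polynomial 1 never appears, whereas for instance `combination([-1], [1])` gives the constant 4.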

Exercise. Let $a(x)$ and $b(x)$ be two polynomials of $E[x]$, where $E$ is a Euclidean domain; show that $f(x) = a(x)\,u(x) + b(x)\,v(x)$ if and only if $f(x)$ is divisible by a particular polynomial.

2-49 Comparison between $R$ and $R[x]$. If the factorisation in $D$ is unique, then the same holds for

$$D_1 = D[x_1],\quad D_2 = D_1[x_2],\quad \ldots,\quad D_n = D_{n-1}[x_n] = D[x_1, \ldots, x_n].$$

Hence the polynomials in $n$ variables form a uniquely factorisable domain if the coefficients run over a uniquely factorisable ring, e.g. a field, or the domain $J$ of the integral numbers. On the other hand, starting from a domain of this kind, one gets again integral domains by a homomorphism which maps a particular prime-element and its multiples on zero.

In 2-3 and 2-4, two methods have been developed to generate new rings, integral domains and fields from given ones; these two procedures are of fundamental importance for general algebra. One method consists of the construction of a ring of polynomials over a given ring; the other is a homomorphism by which a suitable subring is mapped on zero. The following collection of results obtained before may be useful.


Let R be                 then R[x] is                 see

a ring                   a ring                       2-33, th 1
a commutative ring       a commutative ring           2-34, th 1
an integral domain       an integral domain           2-34, th 2
uniquely factorisable    uniquely factorisable        2-42
a field                  not a field,                 2-34, th 4
                         but a Euclidean domain       2-47, th 2

Moreover: if in $D$ the factorisation is unique and $D_p$ is the subring of the elements which are divisible by a particular prime-element $p$, then $D$ is mapped on an integral domain $D'$ by the homomorphism mapping $D_p$ on zero [see 2-46]. It is not necessary that $D'$ has a unique factorisation.

If in particular $D = F[x]$, where $F$ is a field, then $D'$ is a field (see 2-47, th 4).

2-5 The fundamental theorem of General Algebra 

2-51 Existence of a root in a suitable extension. Let $\alpha$ be any element of a field $F$, and $f(x)$ be a polynomial of the ring $F[x]$. From the algorithm of division, one gets

$$f(x) = (x - \alpha)\,\varphi(x) + \beta,$$

where $\varphi(x)$ is a polynomial of $F[x]$ and $\beta$ is an element of $F$. By putting $x = \alpha$ (see 2-35) one gets $f(\alpha) = \beta$; hence

$$f(x) = (x - \alpha)\,\varphi(x) + f(\alpha). \tag{1}$$

Therefore $f(x)$ is divisible by $(x - \alpha)$ if and only if $f(\alpha) = 0$. If in particular $f(x)$ is an irreducible polynomial of a degree $> 1$, it cannot be divisible by any factor of degree 1, and therefore there is no element $\alpha$ in $F$ which satisfies the equation

$$f(x) = 0. \tag{2}$$
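
Relation (1) can be checked numerically. The sketch below is a minimal illustration over the rational numbers, assuming polynomials are given as coefficient lists from the highest power downwards; the function names `divide_by_linear` and `evaluate` are introduced here for illustration only.

```python
from fractions import Fraction


def divide_by_linear(coeffs, a):
    """Divide f(x) by (x - a); coeffs run from the highest power down.

    Returns (quotient coefficients, remainder).
    """
    a = Fraction(a)
    partial = []
    carry = Fraction(0)
    for c in coeffs:
        carry = carry * a + Fraction(c)  # one step of the division scheme
        partial.append(carry)
    return partial[:-1], partial[-1]


def evaluate(coeffs, a):
    """Value f(a), computed independently of the division."""
    value = Fraction(0)
    for c in coeffs:
        value = value * Fraction(a) + Fraction(c)
    return value


f = [1, 0, -2]  # x^2 - 2
quotient, remainder = divide_by_linear(f, 3)
```

Here the quotient is $x + 3$ and the remainder equals `evaluate(f, 3)`, in agreement with (1); the remainder vanishes exactly when the chosen element is a root.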

To solve the equation (2), it is therefore necessary to consider a field $F_1$ of which $F$ is a subfield. Every polynomial $f(x)$ of $F[x]$ is also a polynomial of $F_1[x]$, but if $f(x)$ is an irreducible polynomial of $F[x]$, it may nevertheless be a reducible polynomial of $F_1[x]$; in particular $f(x)$ may be divisible by a factor $x - \alpha$, where $\alpha$ is an element of $F_1$. In this case, $\alpha$ satisfies the equation (2). If $F$ is a subfield of $F_1$, then $F_1$ is called an extension of $F$, and if $\alpha$ satisfies the equation (2), then $\alpha$ is said to be a root of the polynomial $f(x)$. To find a root of an irreducible polynomial $f(x)$ of $F[x]$, one has to extend $F$ to a field $F_1$ which contains such an element $\alpha$ that $f(x) = (x - \alpha) f_1(x)$, where $f_1(x)$ is a polynomial of $F_1[x]$. Let e.g. $F$ be the field of the rational numbers; then $f(x) = x^2 - 2$ is irreducible, but $f(x)$ is also a polynomial of $F_1[x]$ when $F_1$ is the field of the real numbers. As a polynomial of $F_1[x]$, $f(x) = (x - \sqrt{2})(x + \sqrt{2})$ is reducible. Let $F_2$ be the field of the complex numbers; then $x^2 + 1$, which is an irreducible polynomial of $F_1[x]$, is a reducible polynomial of $F_2[x]$. A fundamental question of general algebra is whether, given a polynomial $f(x)$ of $F[x]$, one can always extend $F$ to $F_1$ in such a way that $f(x)$ has a root in $F_1$. This question is answered in the affirmative by the following theorem.

Fundamental theorem of General Algebra. If $f(x)$ is a polynomial of $F[x]$, then there exists an extension $F_1$ of $F$ which contains a root $\alpha$ of $f(x)$.

Proof. Since $f(x)$ is a product of (one or more) irreducible polynomials, and the roots of these polynomials are also roots of $f(x)$, there is no loss of generality in supposing that $f(x) = a_0 + a_1 x + \cdots + a_n x^n$ is irreducible. The classes of residues of $F[x]$ modulo $f(x)$ form a field $F'$, and the classes characterised by elements $a$ of $F$ form a subfield which is isomorphic to $F$. By this isomorphism the class $a_0 + k(x) f(x)$ corresponds to the element $a_0$ of $F$ (see 2-47, th 5). One can therefore extend the field $F$ to a field $F_1$ which is isomorphic to $F'$. Let $\alpha$ be the element of $F_1$ which corresponds to the polynomial $x$; then $a_0 + a_1 \alpha + \cdots + a_n \alpha^n$ corresponds to the class containing the polynomial $a_0 + a_1 x + \cdots + a_n x^n = f(x)$, but this class is the class (0). Hence $f(\alpha) = 0$ and $\alpha$ is a root of $f(x)$.


2-52 Extensions of $F$ containing a root of $f(x)$. Let $f(x)$ be a polynomial of degree $n$ which is irreducible in $F[x]$. Again consider the field $F_1$ which is isomorphic to the field of the classes of residues of $f(x)$ in $F[x]$. By $\varphi_\nu(x)$, polynomials of $F[x]$ of a degree $< n$ will be denoted; then every class of residues of $f(x)$ contains one and only one element of the type $\varphi_\nu(x)$, and the elements of $F_1$ can therefore be represented by

$$\varphi_\nu(\alpha) = b_0 + b_1 \alpha + \cdots + b_{n-1} \alpha^{n-1}. \tag{1}$$

Different polynomials $\varphi_\nu(x)$ belong to different classes of residues and correspond therefore to different elements of $F_1$. Hence every element of $F_1$ can be represented in one and only one manner by (1), and it corresponds (1,1) to the ordered set

$$b_0, b_1, \ldots, b_{n-1} \tag{2}$$

of elements of $F$. Those elements for which $b_1 = \cdots = b_{n-1} = 0$ correspond to the elements of $F$. To add (subtract) two elements of $F_1$, one has to add (subtract) the corresponding coefficients (2). The addition and the subtraction are therefore performed as if the elements were vectors; the corresponding statement holds for multiplication with elements of $F$.

Let now $\varphi_1(\alpha)$ and $\varphi_2(\alpha)$ be any pair of elements of $F_1$. From 2-47, theorem 1 it follows that by the algorithm of division, a polynomial $\varphi_3(x)$ satisfying

$$\varphi_1(x)\,\varphi_2(x) - k(x)\,f(x) = \varphi_3(x) \tag{3}$$

can be found; hence

$$\varphi_1(\alpha)\,\varphi_2(\alpha) = \varphi_3(\alpha) \tag{4}$$

can be obtained by a calculation composed of a finite number of elementary operations in the field $F$. Let $\varphi_1(\alpha) \neq 0$; then $\varphi_1(x) \neq 0$. As $f(x)$ is a prime-element of the ring $F[x]$, and therefore relatively prime to $\varphi_1(x)$, the highest common factor $(f(x), \varphi_1(x)) = 1$ can be obtained by the algorithm of the h.c.f. Hence a polynomial $\varphi_4(x)$ of degree $< n$ satisfying

$$\varphi_1(x)\,\varphi_4(x) + h(x)\,f(x) = 1 \tag{5}$$

can be found by elementary operations in $F$, and

$$\frac{1}{\varphi_1(\alpha)} = \varphi_4(\alpha). \tag{6}$$
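
The calculations (3) to (6) can be sketched in Python for the case where $F$ is the field of the rational numbers. This is only an illustrative sketch: the coefficient-list representation (constant term first) and the helper names `poly`, `trim`, `add`, `mul`, `divmod_poly`, `times_mod` and `inverse_mod` are assumptions made here, not notation from the text.

```python
from fractions import Fraction


def trim(p):
    """Drop trailing zero coefficients (constant term comes first)."""
    p = list(p)
    while p and p[-1] == 0:
        p.pop()
    return p


def poly(*coeffs):
    """Polynomial with rational coefficients, constant term first."""
    return trim([Fraction(c) for c in coeffs])


def add(p, q):
    n = max(len(p), len(q))
    p = list(p) + [Fraction(0)] * (n - len(p))
    q = list(q) + [Fraction(0)] * (n - len(q))
    return trim([a + b for a, b in zip(p, q)])


def mul(p, q):
    if not p or not q:
        return []
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return trim(out)


def divmod_poly(p, d):
    """Algorithm of division: p = k*d + r with degree r < degree d."""
    r = trim(p)
    k = [Fraction(0)] * max(len(r) - len(d) + 1, 0)
    while len(r) >= len(d):
        shift = len(r) - len(d)
        factor = r[-1] / d[-1]
        k[shift] = factor
        r = trim([c - factor * d[i - shift] if i >= shift else c
                  for i, c in enumerate(r)])
    return trim(k), r


def times_mod(p, q, f):
    """Equations (3) and (4): product of two elements, reduced modulo f."""
    return divmod_poly(mul(p, q), f)[1]


def inverse_mod(p, f):
    """Equations (5) and (6): inverse of p modulo an irreducible f."""
    r0, r1 = f, divmod_poly(p, f)[1]
    s0, s1 = [], poly(1)          # invariant: r_i = s_i * p (mod f)
    while r1:
        q, r2 = divmod_poly(r0, r1)
        r0, r1 = r1, r2
        s0, s1 = s1, add(s0, mul(poly(-1), mul(q, s1)))
    if len(r0) != 1:              # h.c.f. is not a non-zero constant
        raise ValueError("element is zero modulo f, or f is reducible")
    return divmod_poly([c / r0[0] for c in s0], f)[1]


f = poly(-2, 0, 1)   # x^2 - 2, irreducible over the rationals
alpha = poly(0, 1)   # the class of x
```

With these definitions `times_mod(alpha, alpha, f)` gives the constant 2, and `inverse_mod(poly(1, 1), f)` gives $-1 + \alpha$, in accordance with $(1 + \alpha)(-1 + \alpha) = \alpha^2 - 1 = 1$.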

The considerations leading to the fundamental theorem enable us therefore to extend the field $F$ to a field $F_1$ which contains a root of $f(x)$ and in which the elementary operations (addition, subtraction, multiplication, division) can be performed by methods based only on the elementary operations in $F$. Thus if $F$ is given in the sense that one can perform the elementary operations in every case by a finite number of steps, the same holds for $F_1$. On the other hand $F_1$ is not uniquely determined, but in the sense of isomorphism only. Obviously there are extensions of $F$ which contain a root of $f(x)$ and are non-isomorphic to $F_1$; e.g. every extension of $F_1$ has that property. It will be shown now that every extension of $F$ which contains a root, say $\alpha$, of $f(x)$ contains a subfield which is an extension of $F$ and is isomorphic to $F_1$.

Theorem. Let $f(x)$ be an irreducible polynomial of $F[x]$, and $E$ be any extension of $F$ which contains any root, say $\alpha$, of $f(x)$; then the meet of all the subfields of $E$ which contain $F$ and $\alpha$ forms a field which is isomorphic to the field $F'$ of the classes of residues of $f(x)$ in $F[x]$.

Proof. Every subfield of $E$ which contains $F$ and $\alpha$ contains also the elements

$$\varphi(\alpha) = b_0 + b_1 \alpha + \cdots + b_{n-1} \alpha^{n-1}, \tag{7}$$

where $n$ is supposed to be the degree of $f(x)$, the coefficients run independently over all the elements of $F$, and $\varphi(x)$ again denotes a polynomial of degree $< n$. Obviously the elements (7) form a module $M$. Let $\varphi_1(\alpha)$, $\varphi_2(\alpha)$ be two elements of $M$; if $\varphi_3(x)$ is determined by (3), then $\varphi_3(\alpha) = \varphi_1(\alpha)\,\varphi_2(\alpha)$. Hence $M$ is a ring which is homomorphic to the field $F'$. As has been proved in 2-23, the module $M$ is a field and is isomorphic to $F'$. On the other hand, every subfield of $E$ containing $F$ and $\alpha$ must contain $M$; thus it is the meet of all the subfields of $E$ which contain $M$. Hence the theorem.

2-53 Factorisation of $f(x)$ into linear factors. General remarks. It has been shown in 2-52 that the fundamental theorem of general algebra is far more than a mere theorem on existence of roots. Fields in which those roots exist can actually be constructed, and the operations of addition, subtraction, multiplication and division can be performed practically in the extended field. By the last theorem it has been shown that if the polynomial $f(x)$ under consideration is irreducible in $F[x]$, there exists one extension $F_1$ of $F$ which is uniquely determined in the sense of isomorphism, such that every extension of $F$ containing a root of $f(x)$ is isomorphic to an extension of $F_1$. Obviously there exists an infinite number of extensions of $F$ which are isomorphic to $F_1$; one can even arrange them in such a way that they have no common elements other than elements of $F$. In each of these fields there exist roots of $f(x)$. It is therefore of no use to speak of the roots of $f(x)$, but only of the roots of $f(x)$ in a particular field. About the number of roots of a polynomial of degree $n$ in a field, the following theorem holds.

Theorem. Suppose $f(x)$ to be a polynomial in $F[x]$ of degree $n$.

1. If $F'$ is any extension of $F$, then $f(x)$ has not more than $n$ roots in $F'$.

2. There exists an extension, say $F^*$, of $F$, such that in $F^*$, $f(x) = a(x - \alpha_1) \cdots (x - \alpha_n)$, where $a$ belongs to $F$, and $\alpha_1, \ldots, \alpha_n$ belong to $F^*$, and this representation of $f(x)$ in $F^*[x]$ is unique up to an arbitrary permutation of $\alpha_1, \ldots, \alpha_n$.

Proof. 1. Let $\alpha_1, \ldots, \alpha_m$ be different roots of $f(x)$ in $F'$. Then $f(x)$ is divisible by $x - \alpha_j$, where $j = 1, \ldots, m$; since the factorisation in $F'[x]$ is unique, $f(x)$ is divisible by $(x - \alpha_1) \cdots (x - \alpha_m)$, which is a polynomial of degree $m$. Hence $m \leq n$.
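
A small experiment shows why the first part of the theorem depends on the coefficients forming a field. The sketch below counts roots among the residues modulo an integer; the function name `roots_mod` is hypothetical, and polynomials are assumed to be integer coefficient lists with the constant term first. The residues modulo a prime form a field, those modulo 8 do not.

```python
def roots_mod(coeffs, m):
    """All residues x modulo m for which the polynomial vanishes."""
    def value(x):
        acc = 0
        for c in reversed(coeffs):   # evaluation from the highest power down
            acc = (acc * x + c) % m
        return acc

    return [x for x in range(m) if value(x) == 0]


x2_minus_1 = [-1, 0, 1]   # x^2 - 1
x2_plus_1 = [1, 0, 1]     # x^2 + 1
```

Modulo 7 the polynomial $x^2 - 1$ has the two roots 1 and 6 and $x^2 + 1$ has none, so the bound $n = 2$ is respected; modulo 8, which is not a field, $x^2 - 1$ has the four roots 1, 3, 5, 7.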

2. If $n = 1$, then the proposition holds for $F^* = F$. Suppose by mathematical induction that the proposition holds for degrees $< m$, and let $n = m$. From the fundamental theorem it follows that $F$ has an extension $F_1$ which contains a root, say $\alpha_m$; then $f(x)$ is divisible in $F_1[x]$ by $x - \alpha_m$, say $f(x) = f_1(x)(x - \alpha_m)$. Now $f_1(x)$ is of degree $m - 1$, and from the supposition made for mathematical induction it follows that, in a suitable extension $F^*$ of $F_1$, $f_1(x) = a(x - \alpha_1) \cdots (x - \alpha_{m-1})$, and therefore $f(x) = a(x - \alpha_1) \cdots (x - \alpha_m)$. The uniqueness of the representation follows from the uniqueness of the factorisation of $f(x)$ in $F^*[x]$. Hence the proposition follows by mathematical induction.

By this theorem it is shown that the number of the roots of a polynomial does not exceed the degree in any field, and that it is equal to the degree (certain roots possibly being counted multiply) in a suitably chosen extension. It is however often of great interest to know the exact number of the roots in a particular field, e.g. in the field of the real numbers or in the field of the complex numbers. These problems are not solved by the fundamental theorem of general algebra; they need investigations of a special kind. For the field of the complex numbers, the question is completely solved by the “fundamental theorem of classical algebra”. Methods of determining the number of the real solutions of any polynomial with real coefficients will be given in 5-2. The fundamental theorem of general algebra and other theorems derived from it are of a more general nature than the classical investigations on real and complex roots. In the general theory one does not suppose that the coefficients of the polynomial are numbers; they may be elements of any field, e.g. they may be polynomials in an indeterminate $y$ with complex coefficients:

$$f_0(y) + f_1(y)\,x + \cdots + f_n(y)\,x^n = F(x, y). \tag{1}$$

Thus it follows from the fundamental theorem of general algebra that a suitable extension of the quotientfield of the polynomials $f(y)$ contains an element $\alpha$ for which $F(\alpha, y) = 0$. This consideration is the very basis of the theory of algebraic functions. On the other hand, it is obvious that the field of the real numbers and the field of the complex numbers are of a special interest, as they play an important role in nearly every branch of mathematics. Thus the classical question for the roots of a polynomial in these fields is of general importance far beyond its historical interest. In enquiring about the roots in the field of the real (the complex) numbers, one is in general not satisfied to know the number of the roots in the intervals (the domains) of the real axis (the complex plane). By choosing these intervals (domains) suitably small, one gets approximate values of the roots. To determine the “magnitude” of a root means nothing else than to construct a way leading to an approximation of the root with an error less than any predetermined number. Of course, an irrational number cannot be determined otherwise than by a method approximating it by rational numbers. Even the most familiar formulas expressing irrational numbers (e.g. $\sqrt{2}$) are only rules for an approximation given in a short form. The importance of an approximative calculation of the roots is obvious, especially for problems of applied mathematics. On the other hand, if the elements of a field are not put into a definite order, there are no intervals or domains in that field. For this reason, the question about the magnitude of a root does not arise at all in general algebra.

Thus the problem “to solve an algebraic equation” needs for itself some more detailed specification. If one wants to find a root in any suitable extension, then one has to apply only the methods of 2-52. Again, if the roots in any particular field are required (e.g. the field of the real numbers, or of the complex numbers), then methods appropriate to the special nature of that field are necessary. For the field of the complex numbers, the “fundamental theorem of classical algebra” (see 3-8) gives the most important information. Sometimes the problem is put also in a different way: “Is there a root of the given polynomial in a suitable field out of a particular class of fields?” To this class of problems belongs the investigation of those roots which can be expressed by a finite number of radicals (square roots, cubic roots, etc.). Again an interesting special case concerns those roots which can be expressed by a successive drawing of square roots; every problem of planimetry which can be solved by the help of ruler and compass leads to a root of this class.

2-6 Extension of a field 

2-61 Vectorspaces over a field. The representation of the extension $F_1$ of $F$ as performed in 2-52 reminds one of the vectorspaces considered in Chapter I; this similarity has already been mentioned in 2-52. The only essential difference is that the coordinates are elements of $F$, which is supposed to be an arbitrary field, whereas in Chapter I the coordinates have been supposed to be numbers. It has been mentioned on p. 26 that, except for 1-7, the property of the coordinates “to be numbers” can be disposed of easily. Thus a more general definition of a vectorspace will be given now. Let $M$ be a module, and $F$ be a field; the elements of $F$ are denoted by characters $a, b, c, \ldots$ and the elements of $M$ by Greek characters $\alpha, \beta, \gamma, \ldots$ The common nullelement of $M$ and of $F$ is denoted by 0.

Suppose that the elements of $M$ can be multiplied by the elements of $F$, the products being elements of $M$, and that for this multiplication the following laws hold:

$$\begin{aligned} a(b\alpha) &= (ab)\alpha \\ a(\alpha + \beta) &= a\alpha + a\beta \\ (a + b)\alpha &= a\alpha + b\alpha \\ 1\,\alpha &= \alpha \end{aligned} \tag{1}$$

Then $M$ is said to be a module over $F$. Now $0\,\alpha = (c - c)\alpha = c\alpha - c\alpha = 0$. Similarly it follows from (1) that $c\,0 = 0$, whether the factor 0 in $c\,0$ is regarded as the zero-element of $F$ or of $M$.

Let in particular $M$ be a module over $F$, where there exists a basis of $n$ elements of $M$

$$\alpha_1, \alpha_2, \ldots, \alpha_n$$

such that every element $\alpha$ of $M$ can be represented by

$$\alpha = a_1 \alpha_1 + \cdots + a_n \alpha_n \tag{2}$$

and that

$$c_1 \alpha_1 + \cdots + c_n \alpha_n = 0 \ \text{ implies } \ c_1 = \cdots = c_n = 0; \tag{3}$$

then $M$ is a vectorspace over $F$ of rank $n$, and the elements of $M$ are called vectors. If

$$\alpha = a_1 \alpha_1 + \cdots + a_n \alpha_n = b_1 \alpha_1 + \cdots + b_n \alpha_n,$$

then

$$0 = (a_1 - b_1)\,\alpha_1 + \cdots + (a_n - b_n)\,\alpha_n;$$

hence it follows from (3) that $a_1 - b_1 = \cdots = a_n - b_n = 0$. The representation of a vector of $M$ by (2) is therefore unique. Thus there exists a (1,1)-mapping of the vectors of $M$ on the ordered sets of $n$ elements of $F$

$$(a_1, \ldots, a_n), \tag{4}$$

the addition of vectors and the multiplication of vectors and elements of $F$ being determined by

$$\begin{aligned} (a_1, \ldots, a_n) + (b_1, \ldots, b_n) &= (a_1 + b_1, \ldots, a_n + b_n), \\ c\,(a_1, \ldots, a_n) &= (c a_1, \ldots, c a_n). \end{aligned} \tag{5}$$

Hence

Theorem. Every vectorspace of rank $n$ over $F$ is isomorphic to the system of all the ordered $n$-tuplets (4), where the operations of addition of vectors and of multiplication of a vector with an element of $F$ are determined by (5).
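
The rules (5) do not depend on the coordinates being numbers in the ordinary sense. As a hedged illustration, the following Python sketch takes for $F$ the field of the residues modulo a prime $p$ (an example chosen here, not one used in the text) and represents vectors as tuples; the names `add_vectors` and `scale` are hypothetical.

```python
def add_vectors(u, v, p):
    """First rule of (5): coordinate-wise addition modulo the prime p."""
    return tuple((a + b) % p for a, b in zip(u, v))


def scale(c, u, p):
    """Second rule of (5): multiply every coordinate by c modulo p."""
    return tuple((c * a) % p for a in u)


u, v, p = (1, 4, 2), (3, 3, 3), 5
```

For these values `add_vectors(u, v, p)` is `(4, 2, 0)` and `scale(3, u, p)` is `(3, 2, 1)`; the distributive law `scale(c, add_vectors(u, v, p), p) == add_vectors(scale(c, u, p), scale(c, v, p), p)` can be checked for every residue `c`.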

In particular, a “vectorspace $W$ of rank $n$ over the field of the real (or the complex) numbers” is isomorphic to the “vectorspace $V$ of rank $n$” of Chapter I. Every vector of $W$ is represented by an $n$-vector $(x_1, \ldots, x_n)$ of $V$. But $W$ is not identical with $V$, nor are the vectors of $W$ identical with the $n$-vectors which represent them, as the latter ones are $n$-tuplets by definition. Of course, if a different basis of $W$ is used, the vectors of $W$ are mapped in a different way on the $n$-vectors of $V$. In consequence of the isomorphism between $W$ and $V$, the formulas established for $V$ can be applied to $W$, and it is often convenient to identify $V$ and $W$. On the other hand, whenever different representations of the same vector are considered (e.g. in 6-2), it is necessary to distinguish between a vector of $W$ and its representation by an $n$-vector of $V$. In a corresponding manner the notion of $n$-vector will be used in connection with vectorspaces over any field $K$.

A linear transformation of a vectorspace $W$ over $K$ is a mapping of $W$ on itself leaving invariant addition of vectors and their multiplication with elements of $K$. This definition tallies with a characteristic property of linear transformations of spaces $V$ of $n$-vectors [see 1-(11), th 2]. Thus $\xi_\nu \to \xi'_\nu$ implies $\sum c_\nu \xi_\nu \to \sum c_\nu \xi'_\nu$, when $c_1, \ldots, c_m$ are elements of $K$. In the same way as theorem 2 of 1-(11) has been proved, one shows easily that the $n$-vectors $(x_1, \ldots, x_n)$ representing the vectors of $W$ are transformed by linear equations

$$x'_i = \sum_k a_i^k x_k \tag{6}$$

with coefficients $a_i^k$ out of $K$, when $W$ is transformed by a linear transformation. On the other hand, every transformation (6) corresponds to a linear transformation of $W$ when any basis of $W$ is selected.

2-62 Extension of the results of Chapter I to vectorspaces over an arbitrary field. If $F$ is the field of the real (or the complex) numbers, a vectorspace over $F$ is isomorphic to a vectorspace as considered in Chapter I of this book. It was however a matter of convenience only that in Ch. I the coordinates of a vector have been supposed to be numbers. With the only exception of 1-7, no other property of “numbers” has been used than that they form a field. For this reason it was stated in 1-16 that the notion of number could be understood as real number or as complex number. The reader may check that 1-2 to 6, and 1-8 to 11 hold without any further alteration if the notion of “number” is systematically replaced by “element of a field $F$”.

In 1-7, it has been supposed that “number” should mean “real number”. This section forms a portion by itself; for the methods used there, it is essential that 0 cannot be represented as a sum of squares. This supposition is not satisfied in fields of characteristic $p$, and not even in every field of characteristic 0. The condition holds in the field of all the real numbers and in every subfield of it, but it is not satisfied in the field of all the complex numbers. The main result of these considerations will be stated now as a theorem.

Theorem. Given any field $F$, then the investigations of 1-2 to 6 and 1-8 to 11 hold without any further alteration, if the notion of vectorspace is replaced by “vectorspace $V$ over $F$”, $n$-vector by vector of $V$, and if the coefficients of the linear equations, the coordinates of the matrices and the terms of the determinants are supposed to be elements of $F$.

E.g. a vector $\alpha$ is considered to be dependent on the vectors $\beta_1, \ldots, \beta_m$ if $\alpha = b_1 \beta_1 + \cdots + b_m \beta_m$ holds, and the vectors $\beta_1, \ldots, \beta_m$ are independent if $c_1 \beta_1 + \cdots + c_m \beta_m = 0$ implies $c_1 = \cdots = c_m = 0$. The rank $n$ of a vectorspace $V$ is equal to the maximum number of independent vectors in $V$, and $n$ is therefore independent of the choice of the basis of $V$.

2-63 Finite extensions. If a module $R$ over a field $F$ is itself a ring, then $R$ is said to be a ring over $F$. In Chapter VI, rings of matrices will be considered which are rings over the field of the coefficients of the matrices. A special case of great interest is when $F$ is a subring of the ring $R$ over $F$. In this case the unitelement of $F$ is also the unitelement of $R$.

Let $K$ be an arbitrary field, and let $x$ be an indeterminate (the letter $x$ not being used for denoting elements of $K$); the polynomials in $x$ with coefficients of $K$ form an integral domain $K[x]$ which is a ring over $K$, and contains $K$ as a subring. This ring is not a vectorspace, since the powers of $x$ form an infinite set of independent elements; thus there exists no maximum number of independent elements in $K[x]$, and therefore $K[x]$ has no basis. The quotientfield of $K[x]$ will be denoted by

$$K(x). \tag{1}$$

This field is an extension of $K$, but it is not a vectorspace over $K$ since it contains an infinite number of independent elements. Every extension of $K$ which contains $x$ must contain every element of $K(x)$, and every ring over $K$ which contains $x$ must contain every element of $K[x]$. This notion can be generalised when $x$ is an element which is not necessarily an indeterminate. Let $A$ be an extension of $K$, and $\alpha$ be any element of $A$. If $f(x)$ runs over all the polynomials of $K[x]$, then the elements $f(\alpha)$ of $A$ form a ring

$$K[\alpha], \tag{2}$$

and its quotientfield will be denoted by

$$K(\alpha). \tag{3}$$

Hence $K[\alpha]$ is the meet of all the rings containing $K$ and $\alpha$; similarly $K(\alpha)$ is the meet of all the extensions of $K$ which contain $\alpha$. The correspondence by which $f(x)$ is mapped on $f(\alpha)$, when $f(x)$ runs over $K[x]$, is a homomorphism; thus $K[\alpha]$ is homomorphic to $K[x]$. If the field $A$ is a vectorspace of rank, say $n$, over $K$, i.e. if there exists a maximum number $n$ of independent elements in $A$, then $A$ is said to be finite over $K$. The rank $n$ is denoted by

$$n = [A : K]. \tag{4}$$

In this case, there exists in $A$ a basis $\alpha_1, \ldots, \alpha_n$ of $A$ over $K$ such that every element $\alpha$ of $A$ can be represented in one and only one manner by $a_1 \alpha_1 + \cdots + a_n \alpha_n$ by the help of elements $a_1, \ldots, a_n$ of $K$. If an element $\alpha$ of $A$ is a root of a polynomial of $K[x]$, then $\alpha$ is said to be algebraic over $K$; otherwise $\alpha$ is transcendental over $K$. If every element of $A$ is algebraic over $K$, the extension $A$ of $K$ is said to be algebraic over $K$. The interconnection between these notions is shown in the following theorem.

Theorem. 1. If $A$ is finite over $K$, it is algebraic over $K$.

2. If $\alpha$ is algebraic over $K$, then $K(\alpha)$ is finite (and therefore algebraic) over $K$, and it is isomorphic to the field formed by the classes of residues of the irreducible polynomial $f(x)$ of $K[x]$ of which $\alpha$ is a root; furthermore $K(\alpha) = K[\alpha]$. The rank $[K(\alpha) : K] = n$, which is equal to the degree of $f(x)$, is said to be the degree of $\alpha$ over $K$; $1, \alpha, \ldots, \alpha^{n-1}$ form a basis of $K(\alpha)$ over $K$.

3. If $\alpha$ is transcendental over $K$, $K[\alpha]$ is isomorphic to $K[x]$, and therefore it is not a field.

Proof. (1) Let $A$ be finite over $K$, say of rank $n$, and let $\beta$ be any element of $A$. Then $1, \beta, \ldots, \beta^n$ cannot be independent; hence a relation $c_0 + c_1 \beta + \cdots + c_n \beta^n = 0$ holds, where the coefficients $c_0, c_1, \ldots, c_n$ belong to $K$, and at least one of them is different from 0. Therefore $c_0 + c_1 x + \cdots + c_n x^n = f(x)$ is not the polynomial 0 and $f(\beta) = 0$. Hence every element $\beta$ of $A$ is algebraic over $K$, and therefore $A$ is algebraic over $K$.
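
Part (1) can be made concrete for an extension of rank 2. The sketch below assumes, purely for illustration, that $K$ is the field of the rational numbers and $A = K(\sqrt{2})$, whose elements $b_0 + b_1\sqrt{2}$ are stored as pairs $(b_0, b_1)$; the helper names `times`, `relation` and `evaluate` are introduced here. Since $1, \beta, \beta^2$ are three elements of a space of rank 2, a relation between them must exist, and here it can be written down explicitly.

```python
from fractions import Fraction


def times(u, v):
    """Product of b0 + b1*sqrt(2) pairs, using sqrt(2)^2 = 2."""
    return (u[0] * v[0] + 2 * u[1] * v[1], u[0] * v[1] + u[1] * v[0])


def relation(beta):
    """Rational (c0, c1, c2), not all 0, with c0 + c1*beta + c2*beta^2 = 0."""
    b0, b1 = beta
    return (b0 * b0 - 2 * b1 * b1, -2 * b0, Fraction(1))


def evaluate(coeffs, beta):
    """Value c0 + c1*beta + c2*beta^2 + ... as a pair (constant first)."""
    total = (Fraction(0), Fraction(0))
    power = (Fraction(1), Fraction(0))
    for c in coeffs:
        total = (total[0] + c * power[0], total[1] + c * power[1])
        power = times(power, beta)
    return total


beta = (Fraction(1), Fraction(3, 2))   # 1 + (3/2) * sqrt(2)
```

For this `beta` the relation is $-\tfrac{7}{2} - 2\beta + \beta^2 = 0$, and `evaluate(relation(beta), beta)` returns the zero pair.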

(2) Let $A$ be any extension of $K$, and $\alpha$ be an element of $A$ which is algebraic over $K$. Then there exists a polynomial $\varphi(x)$ in $K[x]$ such that $\varphi(\alpha) = 0$. $\varphi(x)$ is a product of irreducible factors, $\varphi(x) = f(x)\,f_1(x) \cdots f_m(x)$. Therefore $f(\alpha)\,f_1(\alpha) \cdots f_m(\alpha) = 0$, and as the factors on the left hand side are elements of the field $A$, one of the factors is zero, say $f(\alpha) = 0$, where $f(x)$ is irreducible and of degree, say $n$. The ring $K[\alpha]$ is homomorphic to $K[x]$. The subring $R$ of $K[x]$ which is mapped on the zero-element of $K[\alpha]$ contains $f(x)$ and all the polynomials divisible by it, but no element of $K$ besides 0. Since $K[x]$ is a Euclidean domain, $R$ contains the h.c.f. of any two of its elements, and since $R$ does not contain 1, it cannot contain any element which is relatively prime to $f(x)$; thus $R$ contains those and only those elements of $K[x]$ which are divisible by $f(x)$. Hence two elements of $K[x]$ are mapped on the same element of $K[\alpha]$ if and only if their difference is divisible by $f(x)$. $K[\alpha]$ is therefore isomorphic to the ring of the classes of residues of $f(x)$ in $K[x]$. Since $f(x)$ is irreducible in $K[x]$, this ring is a field (see 2-47, th 4). Hence $K[\alpha] = K(\alpha)$ and $[K(\alpha) : K] = n$. In every class of residues there exists one and only one polynomial of degree $< n$, say $b_0 + b_1 x + \cdots + b_{n-1} x^{n-1}$ (see 2-47). The elements of $K(\alpha)$ can therefore be represented in one and only one manner by $b_0 + b_1 \alpha + \cdots + b_{n-1} \alpha^{n-1}$. Hence $1, \alpha, \ldots, \alpha^{n-1}$ is a basis of $K(\alpha)$ over $K$.

(3) Let $\alpha$ be transcendental over $K$. $K[\alpha]$ is homomorphic to $K[x]$. The ring $R$ of the elements of $K[x]$ mapped on 0 does not contain any element of $K$ other than 0. Let $R$ contain a polynomial $f(x)$ of degree $> 0$; then $f(\alpha) = 0$, and this implies that $\alpha$ is algebraic over $K$. Since $\alpha$ is supposed to be transcendental, $R$ contains the element 0 only, and therefore the homomorphism is an isomorphism. Hence the theorem.

2-64 Rank of a field over a field 

Theorem. Let $A$ be a finite extension of $K$, and $M$ be a finite extension of $A$; then $M$ is a finite extension of $K$, and

$$[M : K] = [M : A]\,[A : K] \tag{1}$$

holds.

Proof. Let $\alpha_1, \ldots, \alpha_n$ be a basis of $A$ over $K$, and $\beta_1, \ldots, \beta_m$ be a basis of $M$ over $A$; then $n = [A : K]$ and $m = [M : A]$. Any element $\gamma$ of $M$ can be represented by $\gamma = \sum_i \lambda_i \beta_i$, where $\lambda_1, \ldots, \lambda_m$ are elements of $A$, and therefore $\lambda_i = \sum_j c_i^j \alpha_j$, where each $c_i^j$ is an element of $K$. Hence $\gamma = \sum_{i,j} c_i^j \alpha_j \beta_i$. To prove that the $nm$ elements $\alpha_j \beta_i$ form a basis of $M$ over $K$, one has therefore to prove only that they are independent. Suppose now

$$0 = \sum_{i,j} d_i^j \alpha_j \beta_i = \sum_i \beta_i \sum_j d_i^j \alpha_j.$$

Since in the last sum the coefficients of $\beta_i$ are elements of $A$, and $\beta_1, \ldots, \beta_m$ are independent, these coefficients are equal to zero. Similarly $\sum_j d_i^j \alpha_j = 0$ implies $d_i^j = 0$. Hence the $nm$ elements are independent and form a basis of $M$ over $K$. Hence the theorem.

Corollaries 

1. If $A$ is a subfield of $M$ and an extension of $K$, and $M$ is finite over $K$, then $M$ is finite over $A$, $A$ is finite over $K$, and (1) holds.

Proof. As $M$ is finite over $K$, say of rank $q$, and the elements of $A$ belong to $M$, no set of more than $q$ elements of $A$ can be independent over $K$. Hence $A$ is finite over $K$. Similarly every set of more than $q$ elements of $M$ is connected by a linear homogeneous equation, the coefficients being elements of $K$ (and therefore elements of $A$) of which at least one is different from 0. Hence $M$ is finite over $A$, and (1) follows from the above theorem.

2. If $[A : K] = q$, then the degree over $K$ of any element $\alpha$ of $A$ is a factor of $q$.

Proof. $[A : K(\alpha)]\,[K(\alpha) : K] = q$.

3. If $\varphi(x)$ is a polynomial of $K[x]$, and $[K(\alpha) : K] = q$, then $[K(\varphi(\alpha)) : K]$ is a factor of $q$.

4. If $[K(\alpha) : K] = p$ is a primenumber, and $0 <$ degree $\varphi(x) < p$, then $K(\varphi(\alpha)) = K(\alpha)$.

Proof. From the inequality it follows that $\varphi(\alpha)$ is not an element of $K$. Hence $[K(\varphi(\alpha)) : K] > 1$, and it is therefore equal to $p$. Hence $[K(\alpha) : K(\varphi(\alpha))] = 1$. Hence the proposition.

2-65 Highest common factor and extension of a field. If $F'$ is an extension of $F$, and $f_1(x)$ is a polynomial of $F[x]$, then it is also a polynomial of $F'[x]$. If a polynomial $f(x)$ of $F[x]$ is a factor of $f_1(x)$ in the ring $F[x]$, it is also a factor of $f_1(x)$ in $F'[x]$; if on the other hand $f(x)$ is a factor of $f_1(x)$ in $F'[x]$, one gets the quotient of the polynomials by the algorithm of division; its coefficients therefore belong to $F$; hence $f_1(x)$ is divisible by $f(x)$ also in $F[x]$. However $f_1(x)$ may have factors which are polynomials in $F'[x]$ without belonging to $F[x]$. Let $f_1(x)$ and $f_2(x)$ be two polynomials of $F[x]$. The highest common factor $(f_1(x), f_2(x))$ can be calculated by the algorithm of the h.c.f.; its coefficients are obtained by rational operations and belong therefore to $F$; a common factor of those coefficients, which is any element $\neq 0$ of $F$, remains arbitrary. If one considers $f_1(x)$ and $f_2(x)$ as elements of $F'[x]$, then the algorithm furnishes the same polynomial $(f_1(x), f_2(x))$, but a common factor of the coefficients remains arbitrary which is an element $\neq 0$ of $F'$. Hence

Theorem. Let $F'$ be an extension of $F$, and $f_1(x)$, $f_2(x)$ and $f(x)$ be elements of $F[x]$. A highest common factor of $f_1(x)$ and $f_2(x)$ in the ring of polynomials $F[x]$ is also a h.c.f. of those polynomials in $F'[x]$. If $f(x)$ is a factor of $f_1(x)$ in $F'[x]$, it is also a factor of $f_1(x)$ in $F[x]$, and conversely.
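
The point that the algorithm of the h.c.f. never leaves the field of the coefficients can be illustrated as follows. The Python sketch assumes rational coefficients stored as lists with the constant term first; the names `trim`, `poly_mod` and `hcf` are hypothetical helpers, and the arbitrary common factor is fixed here by making the result monic.

```python
from fractions import Fraction


def trim(p):
    """Rational coefficient list (constant first) without trailing zeros."""
    p = [Fraction(c) for c in p]
    while p and p[-1] == 0:
        p.pop()
    return p


def poly_mod(p, d):
    """Remainder of p on division by the non-zero polynomial d."""
    r, d = trim(p), trim(d)
    while len(r) >= len(d):
        shift = len(r) - len(d)
        factor = r[-1] / d[-1]
        r = trim([c - factor * d[i - shift] if i >= shift else c
                  for i, c in enumerate(r)])
    return r


def hcf(p, q):
    """Monic highest common factor of two polynomials, not both zero."""
    p, q = trim(p), trim(q)
    while q:
        p, q = q, poly_mod(p, q)
    return [c / p[-1] for c in p]


f1 = [2, -2, -1, 1]    # (x^2 - 2)(x - 1)
f2 = [-6, -2, 3, 1]    # (x^2 - 2)(x + 3)
```

Here `hcf(f1, f2)` is $x^2 - 2$, obtained by rational operations only, although over the field of the real numbers this common factor splits further into $(x - \sqrt{2})(x + \sqrt{2})$.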


2-66 Multiple roots. Let $\alpha$ be a root of a polynomial $f(x)$ of $K[x]$ and let $K(\alpha) = A$; then $f(x)$ can be represented in $A[x]$ by

$$f(x) = (x - \alpha)\,f_1(x).$$

Hence $f(x)$ is divisible by $(x - \alpha)^2$ if and only if $f_1(\alpha) = 0$. Denoting the derivatives in the usual manner,

$$f'(x) = f_1(x) + (x - \alpha)\,f_1'(x),$$

and therefore

$$f'(\alpha) = f_1(\alpha).$$

Thus $f'(\alpha) = 0$ is the necessary and sufficient condition for $f(x)$ to be divisible by $(x - \alpha)^2$. If

$$f(x) = \sum_j a_j x^j, \quad \text{then} \quad f'(x) = \sum_j j\,a_j x^{j-1}.$$

Except for the case that $f(x)$ is 0, the degree of $f'(x)$ is less than the degree of $f(x)$. It may be remembered that the factor $j$ means a sum of $j$ terms, each being equal to the unitelement 1 of $K$ (see 2-26, (3)); thus if the characteristic of $K$ is zero, $j = 0$ in $K$ implies that the integer $j$ is 0, but if the characteristic is $p$, the element $j$ is equal to zero if and only if $j$ is divisible by $p$. Hence in the case of a characteristic $p \neq 0$ the degree of $f'(x)$ may differ by more than 1 from the degree of $f(x)$. Especially $f'(x) = 0$ if $j\,a_j = 0$ for $j = 0, 1, \ldots, n$. In the case when the characteristic is a primenumber $p$, this condition means that only the coefficients of terms $x^{mp}$ can be different from zero. One can formulate this result in a manner which holds for both the cases.





Theorem 1. f'(x) = 0 if and only if

    f(x) = φ(x^m),    (1)

where m is the characteristic of K.

Indeed for m = 0, the condition means that f(x) is of degree < 1, whereas
for primenumber characteristic, the theorem has been proved just before.
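
For a primefield of characteristic p the theorem can be checked mechanically on polynomials of small degree. The Python sketch below is an added illustration, not part of the original text; the function names are chosen for the example, and the coefficients are taken from the integers modulo p, which is only one particular field of characteristic p.

```python
from itertools import product

def derivative(coeffs, p):
    """Formal derivative over the integers mod p; coeffs[j] is the coefficient a_j of x^j."""
    return [(j * a) % p for j, a in enumerate(coeffs)][1:]

def is_polynomial_in_x_to_the_p(coeffs, p):
    """True if a_j = 0 (mod p) whenever j is not divisible by p."""
    return all(a % p == 0 for j, a in enumerate(coeffs) if j % p != 0)

def theorem_holds_up_to_degree(p, max_degree):
    """Compare 'f' = 0' with 'f belongs to K[x^p]' for every f of degree <= max_degree."""
    for coeffs in product(range(p), repeat=max_degree + 1):
        derivative_is_zero = not any(derivative(coeffs, p))
        if derivative_is_zero != is_polynomial_in_x_to_the_p(coeffs, p):
            return False
    return True

# x^6 + 2 x^3 + 1 over the integers mod 3 is a polynomial in x^3.
example = derivative([1, 0, 0, 2, 0, 0, 1], 3)
```

An exhaustive check of this kind is of course no proof; it merely exhibits the two conditions side by side.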

Let f(x) be irreducible and of degree > 0; consider the h.c.f.
(f(x), f'(x)), then two cases have to be distinguished:

(1) f'(x) ≠ 0, then (f(x), f'(x)) = 1;
(2) f'(x) = 0, then (f(x), f'(x)) = f(x).

If again α is a root of f(x), then f'(α) ≠ 0 in the first case and f'(α) = 0 in
the second case. Hence

Theorem 2. Let f(x) be an irreducible polynomial of K[x], and let α
be a root of it. Then f(x) is divisible by (x − α)^2 if and only if K has
a primenumber characteristic p, and f(x) is a polynomial in x^p over K.

If f(x) is not divisible by a factor (x − α)^2 in any extension of K, then
it is said to be separable, otherwise non-separable. Irreducible polynomials
over a field of characteristic 0 are therefore separable, whereas irreducible
polynomials over a field K of characteristic p are non-separable if and only
if they belong to K[x^p].

2-67 Non-separable polynomials. Let K be a field of characteristic
p, and f(x) be an irreducible polynomial over K, and of degree n. Then
there exists a uniquely determined integral number e ≥ 0, such that f(x)
belongs to K[x^{p^e}], but not to K[x^{p^{e+1}}]. Thus f(x) is separable if and
only if e = 0. At any rate

    f(x) = ψ(x^{p^e}).    (1)

The polynomial ψ(y) of K[y] is irreducible, as ψ(y) = ψ_1(y) ψ_2(y) implies
f(x) = ψ_1(x^{p^e}) ψ_2(x^{p^e}). Moreover ψ(y) cannot belong to K[y^p], otherwise
f(x) must belong to K[x^{p^{e+1}}]. Hence ψ(y) is irreducible and separable.
Let q be the degree of ψ(y), then

    n = q p^e.

In a suitable extension A of K, the polynomial ψ(y) can be represented by

    ψ(y) = a (y − β_1) ... (y − β_q),    (2)

where β_1, ..., β_q belong to A, and a belongs to K. Hence

    f(x) = a (x^{p^e} − β_1) ... (x^{p^e} − β_q).    (3)

Since ψ(y) is separable, β_1, ..., β_q are q different elements.

In a suitable extension M of A, there exist elements γ_1, ..., γ_q such that

    γ_j^{p^e} = β_j, for j = 1, ..., q.    (4)

Since the characteristic of M is p (see 2-26, (2)),

    (x − γ_j)^{p^e} = x^{p^e} − γ_j^{p^e} = x^{p^e} − β_j.

Hence

    f(x) = a [(x − γ_1) ... (x − γ_q)]^{p^e}.    (5)

Since the q elements β_j are different, it follows from (4) that γ_1, ..., γ_q
are different. Hence

Theorem. An irreducible polynomial f(x) over a field K of
characteristic p which is of degree n, has exactly q = n/p^e different roots in
a suitable extension of K, and it can be represented by (5). The integral
number e ≥ 0 is uniquely determined by the condition that f(x) belongs to
K[x^{p^e}], but not to K[x^{p^{e+1}}].
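
The numbers e and q depend only on which powers of x occur in f(x) with a coefficient different from zero. The following Python sketch, added as an illustration with a function name chosen for the example, reads them off from the set of those exponents. The two sample polynomials, x^9 − t and x^6 + t x^3 + t over the field of rational functions in t with coefficients from the primefield of characteristic 3, are assumptions made for the example and do not occur in the text; the code does not test irreducibility, it presupposes it.

```python
def inseparability_data(exponents, p):
    """Return (e, q, n) with n = q * p**e for an irreducible f of degree n > 0.

    `exponents` is the set of all j for which the coefficient a_j of x^j is
    different from zero; p is the (primenumber) characteristic of the field.
    """
    n = max(exponents)
    if n <= 0:
        raise ValueError("f(x) must be of degree > 0")
    e = 0
    # f belongs to K[x^(p^(e+1))] as long as p^(e+1) divides every exponent.
    while all(j % p ** (e + 1) == 0 for j in exponents):
        e += 1
    return e, n // p ** e, n

first = inseparability_data({0, 9}, 3)       # x^9 - t
second = inseparability_data({0, 3, 6}, 3)   # x^6 + t x^3 + t
```

In the first case f(x) has a single root of multiplicity 9, in the second case two different roots of multiplicity 3 each, in accordance with (5).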


2-7 Repeated extension of a field 

2-71 Extension of a field to a ring and to a field by a finite number of
steps. Let K be a subfield of A, and let

    α_1, α_2, ..., α_m    (1)

be elements of A. Denote

    K(α_1) = K_1, K_1(α_2) = K_2, ..., K_{m−1}(α_m) = K_m.    (2)

Thus K_1 is the meet of all extensions of K which contain α_1 (see 2-63), and
K_2 is the meet of all the extensions of K_1 which contain α_2. Hence K_2 is an
extension of K containing α_1 and α_2. On the other hand every extension
K' of K which contains α_1 and α_2 is an extension of K_1, therefore K_2 is a
subfield of K', hence K_2 is the meet of all the extensions of K which contain
α_1 and α_2. Similarly K_m is the meet of all those extensions of K which
contain all the elements (1). One therefore denotes

    K_m = K(α_1, ..., α_m),    (3)

where the elements in the bracket can be interchanged among themselves
arbitrarily.

Let f(x_1, ..., x_m) run over all the polynomials of K[x_1, ..., x_m]
(see 2-36), then the elements f(α_1, ..., α_m) form a ring

    K[α_1, ..., α_m]    (4)

which is homomorphic to K[x_1, ..., x_m] and which is contained in K_m. The
quotientfield of (4) is also contained in K_m. Since on the other hand the
quotientfield is an extension of K containing the m elements (1), it must
be identical with K_m. Hence every element of K_m can be represented in
the form

    f(α_1, ..., α_m) / φ(α_1, ..., α_m),    (5)

where the numerator runs over all the elements of (4), and the denominator
over those elements of (4) which are different from zero. Though
K[α_1, ..., α_m] is homomorphic to K[x_1, ..., x_m], the quotientfield
K(α_1, ..., α_m) is in general not homomorphic to K(x_1, ..., x_m), as a field
cannot be homomorphic to a field unless they are isomorphic. Of course,
the correspondence

    f(x_1, ..., x_m) → f(α_1, ..., α_m)

cannot be extended to the correspondence

    f(x_1, ..., x_m) / φ(x_1, ..., x_m) → f(α_1, ..., α_m) / φ(α_1, ..., α_m)

if there are polynomials φ which are different from the null-polynomial and
for which nevertheless φ(α_1, ..., α_m) = 0 holds. If there is no such
polynomial, then K(x_1, ..., x_m) and K(α_1, ..., α_m) are isomorphic.

Theorem. If α_1, ..., α_m are algebraic over K, then

    K(α_1, ..., α_m) = K[α_1, ..., α_m].

Proof (by mathematical induction). For m = 1, the theorem has
been proved in 2-63. Suppose K[α_1, ..., α_{m−1}] = K(α_1, ..., α_{m−1}) = A.
Then α_m is algebraic over A, and K[α_1, ..., α_m] = A[α_m] = A(α_m) =
K(α_1, ..., α_m) holds.

Exercises. Let R be the field of the rational numbers.

(1) Consider A = R(√2, √3). Construct a basis of A and
show that A = R(√2 + √3).

(2) Investigate R(√2, ∛2).

2-72 Primitive element of an extension. If

    A = K(α), [A : K] > 1,

then α is said to be a primitive element of the extension A of K. The examples
given in the exercises just above show that even an extension generated by
more than one step may have a primitive element. As a matter of fact,
a finite extension of a field of characteristic 0 can always be performed by
the help of a primitive element. The corresponding statement holds for a
large class of fields of characteristic p; that this class includes all the
finite fields will be shown later in 3-21 by a different method. To investigate
the case where K has an infinite number of elements, the following lemma is
used.


Lemma. Let f(x) and g(x) be polynomials of K[x] and let, in a
suitable extension of K,

    f(x) = (x − α_1) ... (x − α_n), g(x) = (x − β_1) ... (x − β_m);    (1)

if c is such an element of K that for i = 1, ..., n and k = 2, ..., m the
n(m − 1) inequalities

    γ = α_1 + c β_1 ≠ α_i + c β_k    (2)

hold, then

    K(γ) = K(α_1, β_1).    (3)

Proof. Put K(γ) = K'; then φ(x) = f(γ − c x) is a polynomial of
K'[x] which has the root β_1 in common with g(x). If β_k, for some k ≥ 2,
were a common root of φ(x) and g(x), then γ − c β_k = α_i for some i, contrary
to (2). Hence the h.c.f. (φ(x), g(x)) = x − β_1; as the h.c.f. of two
polynomials of K'[x] has its coefficients in K' (see 2-65), β_1 is an element
of K'. Furthermore α_1 = γ − c β_1 belongs to K', thus every element of
K(α_1, β_1) does. Since on the other hand γ belongs to K(α_1, β_1), the lemma
follows.

Theorem. Let K be a field containing an infinite number of elements,
let α be algebraic over K, and β, ..., κ be roots of separable polynomials of
K[x]; then there exists a primitive element λ for the extension K(α, β, ..., κ)
of K.


Proof. At first it will be proved that K(α, β) = K(α'). Put α = α_1,
β = β_1, and let f(x) and g(x), as represented by (1), be the irreducible
polynomials in K[x] with the roots α and β respectively. Since g(x) is
supposed to be separable, the roots β_1, β_2, ..., β_m are all different. To
determine α', one has to find an element b of K such that

    α' = α_1 + b β_1 ≠ α_i + b β_k for i = 1, ..., n, k = 2, ..., m.

These conditions are satisfied if b is not a root of any one of the linear
equations

    (α_1 − α_i) + x (β_1 − β_k) = 0.    (4)

Since β_1 − β_k ≠ 0, each of these equations has exactly one solution in a
suitable extension of K, and therefore at most one solution in K. Now K
contains an infinity of elements, hence there exists an element b in K which
does not satisfy any one of the n(m − 1) equations (4). Hence α' is a
primitive element of the extension K(α, β) of K. Since α and β are
algebraic over K, the extension K(α) is finite over K, and K(α') = K(α, β)
is finite over K(α) and therefore finite over K. Hence α' is algebraic over
K. Thus K(α, β, ..., κ) = K(α', ..., κ), where α' is algebraic over K, and the
other elements in the bracket are roots of separable polynomials. The
procedure can therefore be repeated, till the number of the elements in the
bracket is reduced to one element:

    λ = α + b β + ... + k κ,

    K(α, β, ..., κ) = K(λ).

If in particular the characteristic of K is 0, then K has an infinite number
of elements, and every irreducible polynomial is separable. Thus one gets
immediately the following corollary.

Corollary. If K is a field of characteristic 0, and α, β, ..., κ
are algebraic over K, then there exists a (primitive) element λ such that
K(α, β, ..., κ) = K(λ) holds.
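
For the field R of the rational numbers and the elements √2, √3, the choice b = 1 satisfies the inequalities of the lemma, so that γ = √2 + √3 is a primitive element. The Python sketch below is an added illustration, not part of the original text. It computes exactly in R(√2, √3), storing a + b√2 + c√3 + d√6 as the tuple (a, b, c, d); this representation and the function names are assumptions of the example. It expresses √2 and √3 as polynomials in γ with rational coefficients.

```python
from fractions import Fraction

# An element a + b*sqrt(2) + c*sqrt(3) + d*sqrt(6) is stored as (a, b, c, d).
def add(u, v):
    return tuple(x + y for x, y in zip(u, v))

def scale(k, u):
    return tuple(Fraction(k) * x for x in u)

def mul(u, v):
    a1, b1, c1, d1 = u
    a2, b2, c2, d2 = v
    return (
        a1 * a2 + 2 * b1 * b2 + 3 * c1 * c2 + 6 * d1 * d2,   # rational part
        a1 * b2 + b1 * a2 + 3 * (c1 * d2 + d1 * c2),          # sqrt(3)*sqrt(6) = 3 sqrt(2)
        a1 * c2 + c1 * a2 + 2 * (b1 * d2 + d1 * b2),          # sqrt(2)*sqrt(6) = 2 sqrt(3)
        a1 * d2 + d1 * a2 + b1 * c2 + c1 * b2,                # sqrt(2)*sqrt(3) = sqrt(6)
    )

gamma = (Fraction(0), Fraction(1), Fraction(1), Fraction(0))   # sqrt(2) + sqrt(3)
gamma2 = mul(gamma, gamma)
gamma3 = mul(gamma2, gamma)
gamma4 = mul(gamma2, gamma2)

# sqrt(2) = (gamma^3 - 9 gamma)/2 and sqrt(3) = (11 gamma - gamma^3)/2
sqrt2 = scale(Fraction(1, 2), add(gamma3, scale(-9, gamma)))
sqrt3 = scale(Fraction(1, 2), add(scale(11, gamma), scale(-1, gamma3)))

# gamma is a root of x^4 - 10 x^2 + 1
relation = add(add(gamma4, scale(-10, gamma2)), (1, 0, 0, 0))
```

Since both √2 and √3 belong to R(γ), and γ belongs to R(√2, √3), the two fields coincide.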

2-73 Extension by roots of two different irreducible polynomials. In
2-71 and 2-72, such extensions of a field K have been considered which are
generated by elements α, β, ... of any field of which K is a subfield. In
an earlier section extensions of a different kind have been used already.
One can extend a field K to a field K(α) where α is not given, but has to
be created in such a way that it is a root of a polynomial f(x) of K[x].
It has been proved in 2-51 that this extension is always possible, and in 2-52
it was shown that if f(x) is irreducible, the extension is determined uniquely
in the sense of isomorphism.

The first statement can be generalised without difficulty to the case
of more than one polynomial. Given polynomials f_1(x), f_2(x), ..., f_k(x)
in K[x], then one can construct by repeated extension a field K(α, β, ..., κ)
such that 0 = f_1(α) = f_2(β) = ... = f_k(κ). Let now the k polynomials be
irreducible in K; then K(α) is determined uniquely in the sense of isomorphism,
but f_2(x), though irreducible in K, may be reducible in K(α). Let
f_2(x) = φ_1(x) φ_2(x), furthermore let β be a root of φ_1(x) and β' a root of
φ_2(x); then it is not certain whether K(α, β) and K(α, β') are isomorphic
or not. That both the cases occur is shown by the following examples on
extensions of the field R of the rational numbers.

1. f_1(x) = x^2 − 2, f_2(x) = x^4 − 2.

    α^2 = 2, f_2(x) = (x^2 − α)(x^2 + α), R(α) = K_1,

    β^2 = α, β'^2 = −α, K_1(β) = R(β), K_1(β') = R(β').

Hence R(α, β) and R(α, β') are isomorphic. It may be mentioned that
these two isomorphic fields of numbers are different fields; the first is a field
of real numbers only, whereas the second contains complex numbers.

2. f_1(x) = x^3 − 2, f_2(x) = x^6 − 2.

    α^3 = 2, f_2(x) = (x^2 − α)(x^4 + α x^2 + α^2), R(α) = K_1,

    β^2 = α, β'^4 + α β'^2 + α^2 = 0, K_1(β) = K_2, K_1(β') = K'_2.

K_2 is composed of real numbers; hence 0 cannot be represented as a sum of
squares of elements of K_2 which are different from 0, nor can 0 be
represented in that manner in any field which is isomorphic to K_2. In K'_2
however

    (2β'^2 + α)^2 + α^2 + α^2 + α^2 = 4(β'^4 + α β'^2 + α^2) = 0

holds. Hence K'_2 is non-isomorphic to K_2. Thus it is possible that a field
K can be extended to two non-isomorphic fields by roots of the same two
polynomials both irreducible in K.

2-74 Normal extension of a field. A case of special interest is when a
field K is to be extended by n different roots of one polynomial of degree n.
In this case, a theorem of uniqueness (in the sense of isomorphism) holds,
and one is led to the normal extensions of a field, which play a very important
role in algebra. For these investigations the following three definitions
will be needed.

Definition 1. Let I' be an isomorphism mapping a field K on a field A;
then a subring R of K is mapped on a subring L of A. This mapping of
R on L is an isomorphism of R to L, say the isomorphism I. Then the
isomorphism I' is called an extension of the isomorphism I.

Often the problem occurs of finding an extension I' to a given
isomorphism I.

Definition 2. Let f(x) be a polynomial of K[x]; then there exists an
extension M of K, in which f(x) is a product of linear factors. Let
α_1, ..., α_n be the roots of f(x) in M; then f(x) can be represented as a
product of linear factors only in those subfields of M which contain
K(α_1, ..., α_n). This field is therefore said to be a smallest extension of K
admitting the complete reduction of f(x).

Theorem 1. Let K and A be isomorphic, f(x) and φ(x) be corresponding
polynomials of K[x] and A[x]. Let K' be a smallest extension of K
admitting the complete reduction of f(x), and let A' be a smallest extension
of A admitting the complete reduction of φ(x); then every isomorphism I of K
and A can be extended to an isomorphism I' of K' and A'.

Proof. The theorem holds obviously if [K' : K] = 1, since in this
case K = K' and A = A'. To prove the theorem by mathematical induction,
it will be supposed to hold for [K' : K] < m. Let [K' : K] = m, and
f_1(x) be an irreducible factor of f(x) of degree > 1. Let φ_1(x) be the
polynomial of A[x] corresponding to f_1(x), let α be a root of f_1(x) in K',
and let β be a root of φ_1(x) in A'. We can extend the isomorphism I to an
isomorphism of the classes of residues of f_1(x) in K[x] and of φ_1(x) in A[x],
and therefore to an isomorphism I_1 of K(α) and A(β). Every extension of K(α)
admitting the complete reduction of f(x) is an extension of K admitting the
complete reduction of f(x); hence K' is a smallest extension of K(α)
admitting this reduction. For the same reason A' is a smallest extension of
A(β) admitting the complete reduction of φ(x). As [K' : K(α)] < m, the
isomorphism I_1 can be extended to an isomorphism I' of K' and A', and since
I' is an extension of I, the theorem holds.

This theorem can be applied also to the case when K and A are identical,
and I maps every element of K on itself. So one gets the following
important corollary.

Corollary. Any two smallest extensions of K admitting the complete
reduction of a polynomial f(x) of K[x] are isomorphic. The isomorphism
can be chosen in such a way that every element of K is represented by
itself.

Definition 3. Let N be an extension of K with the property that every
irreducible polynomial of K[x] which has a root in N can be represented in
N[x] as a product of polynomials of degree 1; then N is called a normal
extension of K.

Theorem 2. If K' is a smallest extension of K admitting the complete
reduction of an arbitrary polynomial f(x) of K[x], then K' is a normal
extension of K.

Proof. Let α_1, ..., α_n be the roots of f(x) in K', let g(x) be irreducible
in K, let β and β' be roots of g(x), and let β belong to K'. It will be
shown that β' also belongs to K'. If not so, then β' belongs to a suitable
extension of K'. Now K_1 = K(β) and K'_1 = K(β') are isomorphic,
and there exists an isomorphism I of these fields by which every element
of K corresponds to itself, and β corresponds to β'. From theorem 1 it
follows that one can extend I to an isomorphism I' of
K_1(α_1, ..., α_n) = K(α_1, ..., α_n) = K' and K'_1(α_1, ..., α_n). By I' every
root of a polynomial with coefficients from K will be represented by a root of
the same polynomial; hence the elements α_i will only be interchanged. Put
β = F(α_1, ..., α_n), where the coefficients of F are elements of K. Hence
β' = F(α_{k_1}, ..., α_{k_n}), and therefore β' belongs to K'. Since β' is
supposed to be an arbitrary root of g(x), the theorem holds.

*2-741 Generalisations of the theorem on normal extensions. Consider
now two different generalisations of this theorem.

Theorem 1. Let f_1(x), f_2(x), ... be a sequence of polynomials of K[x],
finite or infinite in number; the smallest extension of K admitting the
complete reduction of all these polynomials is a normal extension of K.

Proof. Let F_m(x) = f_1(x) f_2(x) ... f_m(x), and let K_m be the normal
extension obtained by extending K with the roots of F_m(x), m = 1, 2, ...
Now K_1 is a subfield of K_2, again K_2 a subfield of K_3, and so on; the
smallest extension of K admitting the complete reduction of all the f_m(x) is
the join of the fields K_m, i.e. the set of all those elements which belong to
any field K_m, for m = 1, 2, ... This set is a field K*. If therefore an
irreducible polynomial g(x) of K[x] has a root in K*, this root is an element
of a suitable K_m, and since K_m is a normal extension of K, the polynomial
g(x) is a product of linear polynomials in K_m and in every extension of K_m,
in particular in K*. Hence the theorem holds.

The second generalisation concerns the factorisation of g(x) in a field N
which is normal and algebraic over K, when g(x) is irreducible in K[x].
Let ψ_1(x), ..., ψ_m(x) be the irreducible factors of g(x) in N, let β_1 be a
root of ψ_1(x) and β_2 a root of ψ_2(x) in a suitable extension of N. The
coefficients of the ψ_i(x) are roots of polynomials f_1(x), ..., f_k(x),
irreducible in K. As N is normal over K, the roots of every f_i(x) belong all
to N. Let α_1, ..., α_n be these roots; then K ⊂ K(α_1, ..., α_n) = A ⊂ N.
A is a finite extension of K and normal over K; the factorisation of g(x) in
A is the same as in N.

* May be omitted at the first reading 




As β_1 and β_2 are roots of g(x), irreducible in K, the fields K(β_1) and
K(β_2) are isomorphic, and there is an isomorphism by which β_1 is represented
by β_2, and the elements of K do not change. Hence no f_i(x) is changed by it.
We can extend this isomorphism to an isomorphism of K(β_1, α_1, ..., α_n)
= A(β_1) and K(β_2, α_1, ..., α_n) = A(β_2). By this isomorphism, every f_i(x)
remains invariant; its roots are therefore interchanged only, hence an
element of A is represented by an element of A. The polynomial ψ_1(x), which
is irreducible in A and which is a factor of the polynomial g(x), irreducible
in K, must therefore be represented by a factor of g(x) which is irreducible
in A. Since the root β_1 of ψ_1(x) is represented by the root β_2 of ψ_2(x),
the image of ψ_1(x) is ψ_2(x), and as these two polynomials are arbitrary
irreducible factors of g(x), the following theorem holds.

Theorem 2. If g(x) is irreducible in K, and N is normal and algebraic
over K, every irreducible factor of g(x) in N can be transformed into every
other by a suitable automorphism of a certain field over K; hence these
factors are all of the same degree.

If one of the irreducible factors is of degree 1, the others are also
linear; so 2-74, theorem 2 is a special case of this theorem.

2-742 Automorphisms of a normal extension. Let N = K(α_1) be a
normal extension of K, and [N : K] = n. Then 1, α_1, ..., α_1^{n−1} form a basis
of N over K, and therefore α_1 is the root of a polynomial of degree n which
is irreducible in K[x],

    f(x) = a_0 + a_1 x + ... + a_{n−1} x^{n−1} + x^n.

As N is a normal extension of K, and it contains a root of f(x), this
polynomial is factorised in N[x] by

    f(x) = (x − α_1)(x − α_2) ... (x − α_n).

Since K(α_1) is isomorphic to K(α_j), so [K(α_j) : K] = n; furthermore α_j and
therefore every element of K(α_j) are contained in K(α_1), hence
[K(α_1) : K(α_j)] = 1. Therefore

    N = K(α_1) = ... = K(α_n)

holds. There exists an isomorphism mapping K(α_1) on K(α_j) by which the
elements of K remain unaltered. This isomorphism is an automorphism of N.
On the other hand, if there is an automorphism of N for which the elements
of K remain fixed, f(x) is transformed into itself and therefore every root of
f(x) is mapped on a root of f(x). The roots of f(x) are therefore undergoing
a permutation by the automorphism, and the automorphism is uniquely
determined by the condition that α_1 is to be mapped on α_j, since this
condition implies that the basis 1, α_1, ..., α_1^{n−1} is mapped on the basis
1, α_j, ..., α_j^{n−1}. Hence there exist exactly n automorphisms of N for which
the elements of K remain invariant. For n > 2, not every permutation of
the roots of f(x) corresponds to one of these automorphisms. There exists
one and only one automorphism of N for which the elements of K remain
invariant and a particular root of f(x), say α_i, is transformed into any
particular root, say α_j. For n = 1 this automorphism is the identity. If in
particular K is of characteristic 0, and M is a finite and normal extension
of K, then there exists a primitive element α (see 2-72) such that M = K(α),
and the above statements on automorphisms hold.

Moreover, let K be an arbitrary field, [A : K] = 2, and β_1 be an element
of A which does not belong to K. Then A = K(β_1) (see 2-64, corollary 4).
β_1 is a root of a polynomial of degree 2, say f(x), which is irreducible in K,
but reducible in A:

    f(x) = (x − β_1)(x − β_2).

Hence β_2 belongs to A. An extension of rank 2 is therefore always a normal
extension. Besides the identity, there exists one automorphism of A
for which K remains invariant. It interchanges β_1 and β_2, and since 1, β_1
is a basis of A over K, one can express it by

    α = a + b β_1 ↔ a + b β_2 = ᾱ,

where a and b run over K. The elements α and ᾱ are said to be conjugate.
Conjugacy is a symmetric relation, because every element is the conjugate
of its conjugate. The elements of K are the only ones which are
self-conjugate. The remaining elements of A consist of pairs of conjugate
elements.



CHAPTER III 


GENERAL ALGEBRA, SPECIFIED THEORY 
3-1 Cyclotomic polynomials 

The following equation plays an important role in algebra:

    x^n − 1 = 0.    (1)

As 1 may be the unitelement of any field K, the polynomial on the left
hand side of (1) can be considered as belonging to any ring of polynomials
K[x]. To investigate (1), the nature of the field K has therefore to be taken
into account. If K is the field of the complex numbers, then the solutions
of (1) are

    e^{2πik/n} = cos(2πk/n) + i sin(2πk/n),    (2)

for k = 1, ..., n.

By representing these points in the complex plane in the usual manner,
one gets n points which subdivide the unitcircle into n equal arcs. The
problem of partitioning a circle into congruent arcs leads therefore to the
equation (1). Here, the equation will be considered from a purely algebraic
point of view.

3-11 Reduction of the problem to the case when n is not divisible by the
characteristic. For f(x) = x^n − 1, f'(x) = n x^{n−1}. The highest common
factor is therefore found to be

    (f(x), f'(x)) = f(x) if n is divisible by the characteristic of K,    (1)
                  = 1 if n is not divisible by the characteristic of K.

In the 2nd case, f(x) has n different roots.

Suppose at first that n is divisible by the characteristic of K; this is
possible only if the characteristic is a primenumber, say p. Put

    n = p^e m, where e > 0, (p, m) = 1,    (2)

and

    g(x) = x^m − 1.    (3)

Since (g(x), g'(x)) = 1, the polynomial g(x) is separable (see 2-6) and it has
therefore m different roots in a suitable extension of K. As furthermore
the characteristic of K is equal to p,

    (g(x))^{p^e} = x^n − 1 = f(x).

Hence f(x) has the same roots as g(x) has; each of these roots occurs once
as a root of g(x), and p^e times as a root of f(x). Thus the problem has been
reduced to the case where n is not divisible by the characteristic of K.
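
The identity (g(x))^{p^e} = x^n − 1 can be verified by direct multiplication for small values. The Python sketch below is an added illustration with function names chosen for the example; it works with coefficients from the integers modulo p, which is one particular field of characteristic p.

```python
def poly_mul_mod(a, b, p):
    """Product of two polynomials with coefficients reduced mod p; a[j] belongs to x^j."""
    out = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] = (out[i + j] + ai * bj) % p
    return out

def poly_pow_mod(a, k, p):
    """k-th power of a polynomial mod p by repeated multiplication."""
    result = [1]
    for _ in range(k):
        result = poly_mul_mod(result, a, p)
    return result

def x_power_minus_one(n, p):
    """Coefficient list of x^n - 1 over the integers mod p (n >= 1)."""
    return [(-1) % p] + [0] * (n - 1) + [1]

# p = 3, e = 2, m = 4, n = 36: (x^4 - 1)^9 = x^36 - 1
lhs = poly_pow_mod(x_power_minus_one(4, 3), 9, 3)
rhs = x_power_minus_one(36, 3)
```

All the intermediate binomial coefficients are divisible by p, so only the two extreme terms survive.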

3-12 Primitive roots. Suppose now that n is not divisible by the
characteristic of K; this supposition holds e.g. when the characteristic
is 0. In any field admitting the complete reduction of f(x), this polynomial
has therefore n different roots. The same holds for x^h − 1 = 0, where h
is an arbitrary factor of n.

Let α and β be roots of f(x), thus α^n = 1 = β^n, hence (αβ)^n = 1,
(αβ^{−1})^n = 1. The roots of f(x) therefore form a multiplicative abelian group
Γ. Let r be the smallest positive integer for which α^r = 1 holds; then r is
said to be the order of α in Γ. As α^{sr+tn} = 1 for every pair of integers s
and t, α^m = 1, where m = (r, n). Hence m ≥ r, so that m = r, and therefore r
is a factor of n. On the other hand, if r is a factor of n, then the roots of
x^r − 1 are at the same time roots of x^n − 1. The elements of Γ of order n
are called primitive roots of f(x); the number of these primitive roots will
be denoted by φ(n).

To prove that for every n primitive roots exist, one has to show that
φ(n) > 0. In the following, the value of φ(n) will be calculated; it will
be shown to be equal to a well known function of the elementary theory
of numbers and to take positive values only.

Let α be a root of order h, and 0 < t. Put β = α^t, (t, h) = r = at + bh,
and h/r = s, t/r = u; then β^s = α^{ts} = α^{hu} = 1. On the other hand, let

    0 < s' < s,

then

    β^{s'a} = α^{ts'a} = α^{rs'} ≠ 1,

since 0 < r s' < h; hence β^{s'} ≠ 1, and s is the order of α^t. Thus those and
only those elements α^t are of order h, for which (t, h) = 1. Now α is a
primitive root of x^h − 1. Hence if there exists a primitive root, the number
of all the primitive roots is equal to the number of the natural numbers
which are smaller than the exponent and relatively prime to it. Let A be
an extension of K admitting the complete reduction of f(x) = x^n − 1, and let
α be a root of order h, β be a root of order k and (h, k) = 1. Consider the
hk products

    α^r β^s, for r = 0, ..., h − 1, s = 0, ..., k − 1.    (1)

Now α^r β^s = α^{r'} β^{s'} implies α^{r−r'} = β^{s'−s}. The left hand side has an
order which is a factor of h, whereas the order of the element on the right is
a factor of k; since (h, k) = 1, the order of both the sides must be 1; the two
sides are therefore equal to 1. Hence r − r' = s' − s = 0. Thus the hk
elements (1) are all different; they form therefore a full system of roots of
the polynomial x^{hk} − 1. Now it will be proved that αβ is of order hk.

Let (αβ)^u = α^r β^s, where r ≡ u (mod h) and s ≡ u (mod k) and r, s
satisfy the same conditions as in (1). As the elements α^r β^s are all
different, (αβ)^u = 1 implies r = s = 0; therefore u must be divisible by
h and k, therefore by hk. Hence the order of the root αβ is equal to
hk; the same holds for the elements

    α^r β^s, for which (r, h) = 1, (s, k) = 1,    (2)

as in this case α^r is of order h, and β^s of order k. If however (r, h) = v > 1,
then (α^r β^s)^{hk/v} = 1 and α^r β^s is of a smaller order than hk; similarly if s
is not relatively prime to k. Hence the elements (2) are the only ones of
order hk, and therefore the only primitive roots of x^{hk} − 1. Hence

    φ(hk) = φ(h) φ(k), for (h, k) = 1.    (3)

This formula can immediately be generalised to a product of any number
of factors, where the arguments are relatively prime. Especially

    φ(q_1^{k_1} q_2^{k_2} ... q_m^{k_m}) = φ(q_1^{k_1}) φ(q_2^{k_2}) ... φ(q_m^{k_m}),    (4)

where q_1, q_2, ..., q_m are all different primenumbers.

To determine φ(q^k), where q is a primenumber, consider a root α of
x^{q^k} − 1 which is non-primitive. Then α^{q^{k−1}} − 1 = 0. There are q^{k−1}
elements of A satisfying this condition. The number of the primitive roots is
therefore q^k − q^{k−1}. Hence

    φ(q^k) = q^k (1 − 1/q).    (5)

From (4) and (5) follows

    φ(n) = n (1 − 1/q_1)(1 − 1/q_2) ... (1 − 1/q_m) > 0,    (6)

where q_1, ..., q_m are the different prime-factors of n. The essence of these
considerations is given by the following theorem.

Theorem. If x^n − 1 is a polynomial of K[x], and the characteristic of
K is not a factor of n, then this polynomial has n different roots. There are
φ(n) primitive roots of x^n − 1; φ(n) is equal to the number of the positive
integers ≤ n which are relatively prime to n and is given by (6). Each
root of x^n − 1 is a power of every arbitrary primitive root of that polynomial.
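
The three descriptions of φ(n) in the theorem can be compared numerically. In the Python sketch below, added as an illustration, φ(n) is computed once by counting and once by formula (6), and the primitive roots of x^n − 1 are enumerated in the primefield of the integers modulo a primenumber p for which n is a factor of p − 1. That choice of field and the function names are assumptions made for the example.

```python
from math import gcd

def phi_by_counting(n):
    """Number of integers t with 1 <= t <= n and (t, n) = 1."""
    return sum(1 for t in range(1, n + 1) if gcd(t, n) == 1)

def phi_by_formula(n):
    """Formula (6): n times the product of (1 - 1/q) over the prime-factors q of n."""
    result, m, q = n, n, 2
    while q * q <= m:
        if m % q == 0:
            result -= result // q
            while m % q == 0:
                m //= q
        q += 1
    if m > 1:                 # one prime-factor larger than sqrt(n) may remain
        result -= result // m
    return result

def order_mod(a, p):
    """Order of a in the multiplicative group of the integers mod p (a not divisible by p)."""
    k, x = 1, a % p
    while x != 1:
        x = x * a % p
        k += 1
    return k

def primitive_roots_of_unity(n, p):
    """Elements of order n mod p, i.e. the primitive roots of x^n - 1 when n divides p - 1."""
    return [a for a in range(1, p) if order_mod(a, p) == n]

roots12 = primitive_roots_of_unity(12, 13)
roots6 = primitive_roots_of_unity(6, 13)
```

Every root of x^12 − 1 modulo 13, that is every element different from zero, is a power of each of the elements in roots12.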

3-13 Cyclotomic polynomials of order n. Let r_1, r_2, ..., r_m be the
divisors of n which are different from n, and (x^n − 1)/(x^{r_i} − 1) = ψ_i(x);
then the h.c.f. of all the polynomials ψ_i(x) is a polynomial whose roots are
just the primitive roots of x^n − 1. This polynomial is called a cyclotomic
polynomial of order n; its degree is φ(n).

To calculate a cyclotomic polynomial, it is not always necessary to
compute all the polynomials ψ_i(x).

Example. n = 12.

The non-primitive roots of (x^12 − 1) are either roots of (x^6 − 1), or of
(x^4 − 1), the common factor of both polynomials being (x^2 − 1). Hence
the cyclotomic polynomial of order 12 is

    ((x^12 − 1)/(x^6 − 1)) / ((x^4 − 1)/(x^2 − 1)) = (x^6 + 1)/(x^2 + 1) = x^4 − x^2 + 1.
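
The result of the example can be confirmed by a short computation. The Python sketch below is an added illustration and does not follow the h.c.f. procedure of the text: it uses instead the known fact that x^n − 1 is the product of the cyclotomic polynomials of all orders d which divide n, and divides x^n − 1 successively by those of order d < n. The coefficients are ordinary integers, corresponding to characteristic 0, and the function names are chosen for the example.

```python
def poly_exact_div(a, b):
    """Quotient a/b of integer polynomials, b with leading coefficient 1.

    a[j] and b[j] are the coefficients of x^j; the division must leave no remainder.
    """
    a = list(a)
    q = [0] * (len(a) - len(b) + 1)
    for shift in range(len(a) - len(b), -1, -1):
        c = a[shift + len(b) - 1]      # current leading coefficient of the dividend
        q[shift] = c
        for i, bi in enumerate(b):
            a[shift + i] -= c * bi
    assert not any(a), "division was not exact"
    return q

def cyclotomic(n):
    """Cyclotomic polynomial of order n over the integers, as a coefficient list."""
    poly = [-1] + [0] * (n - 1) + [1]          # x^n - 1
    for d in range(1, n):
        if n % d == 0:
            poly = poly_exact_div(poly, cyclotomic(d))
    return poly

phi12 = cyclotomic(12)    # coefficients of x^4 - x^2 + 1, lowest degree first
```

The degree of the result is 4, in agreement with φ(12) = 4.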

Theorem. If α is a root of x^n − 1, K(α) is a normal extension of K.

Proof. Even if n is divisible by the characteristic of K (see 3-11),
α is a root of a polynomial x^m − 1, where m is not divisible by the
characteristic. The order of the root α is a factor, say h, of m (possibly
h = m); thus α is a primitive root of x^h − 1. The roots of this polynomial
are powers of α and therefore contained in K(α). This field is therefore a
smallest extension of K admitting the complete reduction of x^h − 1. Hence the
theorem follows from 2-74, theorem 2.

The factorisation of the cyclotomic polynomials, especially their
irreducibility in a primefield of characteristic 0, will be considered in 3-433.

3-2 Galoisfields

3-21 Fundamental properties. A field Γ which contains only a finite
number of elements is called a Galoisfield. As the primefield of Γ is finite,
the characteristic of Γ must be a primenumber, say p. The primefields
GF_p of characteristic p themselves are instances of Galoisfields. Γ must
be finite over its primefield GF_p, otherwise it would contain an infinite
number of elements. Hence Γ is algebraic over GF_p (see 2-63), and there
exists a finite basis of say n elements

    α_1, ..., α_n    (1)

of Γ, so that every element of Γ can be represented in one and only one
manner by

    β = b_1 α_1 + ... + b_n α_n,    (2)

where b_1, ..., b_n are elements of GF_p. Hence Γ has exactly p^n different
elements, where p is the characteristic of Γ and n = [Γ : GF_p]. The
elements ≠ 0 of Γ form a multiplicative abelian group A with p^n − 1
elements. Let β be an element of A, then β, β^2, ... cannot be all different.
From β^a = β^b it follows β^{a−b} = 1. Hence each element of A is a root of a
cyclotomic equation. Let r be the order of β, hence β, β^2, ..., β^r are
all different. Let two elements γ and δ be considered as equivalent
elements if δγ^{−1} = β^t; then this equivalence defines a partition of A in
classes (see 2-13). Each class contains r different elements. Let s be the
number of the classes, then rs = p^n − 1 holds. Hence the order of every
element of A is a factor of p^n − 1, and therefore every element of A is a
root of x^{p^n − 1} − 1. As the polynomial cannot have more than p^n − 1 roots
in Γ, every root is an element of A. So the elements of Γ are identical with
the roots of x^{p^n} − x, and

    x^{p^n} − x = ∏_{i=1}^{p^n} (x − β_i),    (3)

where β_1, ..., β_{p^n} are the elements of Γ.

Let α be a primitive root of x^{p^n − 1} − 1, then

    Γ = GF_p(α)    (4)

is a normal field over GF_p, because it is the smallest extension admitting
the complete reduction of x^{p^n − 1} − 1. From 2-52 it follows that all
Galoisfields with p^n elements are isomorphic. For this reason, (4) will also
be denoted by

    GF_{p^n}.    (4')

To prove that to every p^n there exists a Galoisfield GF_{p^n}, it suffices to
show that in every field of characteristic p which admits the complete
reduction of x^{p^n} − x, the roots of this polynomial form a field; indeed the
p^n roots of x(x^{p^n − 1} − 1) are all different, since p^n − 1 is not divisible
by p (see 3-12).

Let $\alpha$ and $\beta$ be roots; then

$$(\alpha\beta)^{p^n} = \alpha^{p^n} \beta^{p^n} = \alpha\beta,$$

$$(\alpha : \beta)^{p^n} = \alpha^{p^n} : \beta^{p^n} = \alpha : \beta,$$

moreover, since the characteristic is equal to $p$, it results from the binomial
theorem that

$$(\alpha \pm \beta)^{p^n} = \alpha^{p^n} \pm \beta^{p^n} = \alpha \pm \beta.$$

Hence $\alpha\beta$, $\alpha : \beta$, $\alpha \pm \beta$ are roots of $x^{p^n} - x$. These roots therefore
form a field. Hence the following theorem holds.

Theorem 1. To every power $p^n$ of a primenumber $p$ there corresponds a
Galoisfield $GF_{p^n}$ which is uniquely determined in the sense of isomorphism;
$p$ is the characteristic of the Galoisfield, and $n = [GF_{p^n} : GF_p]$. Every
element of $GF_{p^n}$ is a root of the polynomial (3), and the elements $\neq 0$ are
powers of any primitive root of $x^{p^n - 1} - 1$.

If $\Gamma'$ is a subfield of $GF_{p^n}$, its characteristic must be equal to $p$. Thus
$\Gamma' = GF_{p^m}$, where $m = [\Gamma' : GF_p]$ is a divisor of $n = [GF_{p^n} : GF_p]$. The
elements $\neq 0$ of $\Gamma'$ are roots of $x^{p^m - 1} - 1$. Now $p^n - 1 = (p^m - 1)s$,
where $s = 1 + p^m + \cdots + p^{n-m}$, as $n$ is divisible by $m$. Hence $x^{p^n - 1} - 1$
is divisible by $x^{p^m - 1} - 1$. There are therefore exactly $p^m - 1$ roots
of $x^{p^m - 1} - 1$ in $GF_{p^n}$. Hence

Theorem 2. If $m$ is a divisor of $n$, then $GF_{p^n}$ has exactly one subfield
of the type $GF_{p^m}$, and these are the only subfields of $GF_{p^n}$.
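
The counting behind Theorem 2 can be checked numerically. The following is only a minimal sketch in Python; the function names `subfield_orders` and `cofactor` are illustrative assumptions and not notation of this book.

```python
def subfield_orders(p, n):
    """Orders p**m of the subfields GF(p**m) of GF(p**n), one per divisor m of n."""
    return [p ** m for m in range(1, n + 1) if n % m == 0]


def cofactor(p, n, m):
    """Return s = 1 + p**m + ... + p**(n - m), assuming that m divides n."""
    if n % m != 0:
        raise ValueError("m must be a divisor of n")
    return sum(p ** (j * m) for j in range(n // m))


if __name__ == "__main__":
    p, n = 2, 12
    print(subfield_orders(p, n))
    for m in (1, 2, 3, 4, 6, 12):
        # p**n - 1 = (p**m - 1) * s, so x**(p**m - 1) - 1 divides x**(p**n - 1) - 1
        assert p ** n - 1 == (p ** m - 1) * cofactor(p, n, m)
```

For $p = 2$, $n = 12$ the divisors $m = 1, 2, 3, 4, 6, 12$ give six subfields, the last one being the field itself.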

Furthermore 

Theorem 3. Every finite extension of a Galoisfield has a primitive
element and is a normal extension.

Proof. Every finite extension of a Galoisfield $\Gamma$ is again a Galoisfield,
say $\Gamma^* = GF_{p^N}$. This field contains primitive roots of $f(x) = x^{p^N - 1} - 1$.
If $\alpha$ is a primitive root, then $\Gamma^* = \Gamma(\alpha)$, and $\Gamma^*$ is the smallest extension
admitting the complete reduction of $f(x)$; thus $\Gamma^*$ is a normal extension
of $\Gamma$. Hence the theorem.

The last theorem complements the theorem of 2-72 about the primitive
elements in finite extensions of fields with an infinite number of elements.
In that theorem, a supposition about separability of polynomials was made.
The reason why no similar supposition occurred in the last theorem will
become obvious from the following theorem.

Theorem 4. Every irreducible polynomial of $GF_{p^n}[x]$ is separable.

Proof. Suppose $f(x)$ to be irreducible and non-separable; then $f(x) =
g(x^p) = a_0 + a_1 x^p + \cdots + a_m x^{mp}$. Since the coefficients $a_0, \ldots, a_m$ are
elements of $GF_{p^n}$, there is $a_i = a_i^{p^n} = b_i^p$, where $b_i = a_i^{p^{n-1}}$. Hence
$f(x) = b_0^p + (b_1 x)^p + \cdots + (b_m x^m)^p = (b_0 + b_1 x + \cdots + b_m x^m)^p$
is reducible, contrary to the supposition made for $f(x)$.

It may be mentioned especially that though every primitive root of
$x^{p^n - 1} - 1$ is a primitive element of the extension $\Gamma = GF_{p^n}$ over its
primefield $K = GF_p$, not every primitive element of the extension is a primitive
root. The only condition for an element $\alpha$ of $\Gamma$ to be a primitive element
is that it must be a root of an irreducible factor of $x^{p^n - 1} - 1$ which has the
degree $n$. Then $[K(\alpha) : K] = n = [\Gamma : K]$ and therefore $[\Gamma : K(\alpha)] = 1$,
and therefore $\Gamma = K(\alpha)$. This condition can be satisfied by roots of $x^{p^n - 1} - 1$
which have a smaller order than $p^n - 1$. For an example, see 3-23.

3-22 Automorphisms. To investigate the Galoisfields somewhat
closer, consider their automorphisms. Every automorphism of a
field leaves the nullelement and the unitelement invariant; the same holds
for the elements $2, 3, \ldots$ as these elements are generated by a repeated
addition of the unitelement; thus the elements of the primefield of a
Galoisfield are not altered by an automorphism. $\Gamma = GF_{p^n}$ is a normal extension
of the primefield $K = GF_p$. An automorphism of $\Gamma$ is therefore (see 2-742)
uniquely determined if it is known into which root of the same polynomial
any root of an irreducible polynomial of degree $n$ is to be transformed.
Hence there exist exactly $n$ automorphisms of $GF_{p^n}$. These $n$
automorphisms can be constructed in a very simple way.

Theorem 1. Let $\beta$ run over the $p^n$ elements of $GF_{p^n}$; then the mapping
$\beta \to \beta^p$ is an automorphism $A$ of $GF_{p^n}$.

Proof. As in a field of characteristic $p$ the relations $(\alpha\beta)^p = \alpha^p \beta^p$,
$(\alpha : \beta)^p = \alpha^p : \beta^p$, $(\alpha \pm \beta)^p = \alpha^p \pm \beta^p$ hold, the operations of
addition, subtraction, multiplication and division are invariant. The mapping
therefore preserves these operations, hence it is a homomorphism. The image
consists of more than one element; it is therefore a field. A field cannot be
homomorphic to a field unless the fields are isomorphic. Hence the image
consists of $p^n$ different elements of $GF_{p^n}$, and therefore of all the elements
of $GF_{p^n}$; thus the mapping is an automorphism.

By repeating the automorphism $A$ one gets another automorphism $A^2$
mapping $\beta$ on $\beta^{p^2}$, correspondingly $A^3, \ldots, A^n$. The last of these
automorphisms maps $\beta$ on $\beta^{p^n} = \beta$, hence $A^n$ is the identity. A primitive
root is transformed by

$$A, \ldots, A^n \qquad (1)$$

into different elements. Hence these $n$ automorphisms are all different,
and as there exist exactly $n$ automorphisms of $GF_{p^n}$, the automorphisms (1)
form a full set of the automorphisms of $GF_{p^n}$. Hence the following theorem
holds.

Theorem 2. The automorphisms of $GF_{p^n}$ consist of the transformations
(1), where $A^k$ maps every element $\beta$ of $GF_{p^n}$ on $\beta^{p^k}$. The elements of
the primefield remain invariant for every automorphism.

Corollary. Let $f(x)$ be a polynomial of $GF_p[x]$; if $\alpha$ is a root of $f(x)$,
then $\alpha^p, \alpha^{p^2}, \ldots$ are also roots of $f(x)$.

Proof. $GF_p(\alpha)$ is a Galoisfield, say $GF_{p^n}$. By the automorphism $A^k$
the coefficients of $f(x)$ are invariant; therefore $\alpha$ is transformed into a root
of $f(x)$. Hence $\alpha^{p^k}$ is a root of $f(x)$.
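
The automorphism $\beta \to \beta^p$ and the corollary can be tried out on a small case. The sketch below is written in Python and works in $GF_{5^2}$, whose elements $a + b\alpha$ are stored as pairs `(a, b)`; it assumes the relation $\alpha^2 = \alpha - 2$, i.e. that $\alpha$ is a root of $x^2 - x + 2$ (this is the example treated in 3-23). The helper names `add`, `mul`, `power` and `frobenius` are illustrative only.

```python
P = 5  # characteristic; elements a + b*alpha of GF(25) are stored as pairs (a, b)


def add(u, v):
    return ((u[0] + v[0]) % P, (u[1] + v[1]) % P)


def mul(u, v):
    a, b = u
    c, d = v
    # (a + b*alpha)(c + d*alpha), reduced with the assumed relation alpha**2 = alpha - 2
    return ((a * c - 2 * b * d) % P, (a * d + b * c + b * d) % P)


def power(u, k):
    result = (1, 0)
    for _ in range(k):
        result = mul(result, u)
    return result


def frobenius(u):
    """The mapping beta -> beta**p of Theorem 1."""
    return power(u, P)


ELEMENTS = [(a, b) for a in range(P) for b in range(P)]

if __name__ == "__main__":
    alpha = (0, 1)
    print(frobenius(alpha))             # the image of alpha under A
    print(frobenius(frobenius(alpha)))  # the image of alpha under A**2
```

Applying `frobenius` twice returns every element to itself, in agreement with $A^n$ being the identity for $n = 2$, and the image of $\alpha$ is again a root of $x^2 - x + 2$.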


3-23 Calculation in a Galoisfield. To show how calculation is done
in a Galoisfield, an example will be considered now. The elements of $GF_{5^2}$
are roots of $x^{25} - x = 0$. The cyclotomic polynomial can be calculated by
the rules of 3-13; it is $x^8 - x^4 + 1$. This polynomial cannot be irreducible
in $GF_5[x]$ since $[GF_{5^2} : GF_5] = 2$, and therefore every element is of degree
at most 2 over $GF_5$. Indeed in $GF_5[x]$ there is

$$x^8 - x^4 + 1 = f_1(x) f_2(x) f_3(x) f_4(x), \qquad (1)$$

where $f_1(x) = x^2 - x + 2$, $f_2(x) = x^2 + x + 2$, $f_3(x) = x^2 + 2x + 3$, $f_4(x) =
x^2 - 2x + 3$. If $\alpha$ is any primitive root of $x^{24} - 1$, then the other primitive
roots are $\alpha^5, \alpha^7, \alpha^{11}, \alpha^{13}, \alpha^{17}, \alpha^{19}, \alpha^{23}$, since the exponents must be relatively
prime to 24. Furthermore $\alpha^{-m} = \alpha^{24-m}$, $\alpha^{12} = -1$. From the last corollary
it follows that if $\alpha^m$ is a root of $f(x)$, $\alpha^{5m}$ is the second root. Hence the 8
primitive roots consist of the four pairs

$$(\alpha, \alpha^5), \quad (\alpha^{13}, \alpha^{17}), \quad (\alpha^{19}, \alpha^{23}), \quad (\alpha^7, \alpha^{11}). \qquad (2)$$

In these 4 pairs, each root is the 5th power of the other one. But one cannot
allot the pairs (2) in an arbitrary manner to the 4 irreducible polynomials (1)
as their roots. Without loss of generality, suppose that $\alpha$ is a root of $f_1(x)$;
then $-\alpha = \alpha^{13}$ is a root of $f_2(x)$. Since $\alpha$ is a root of $f_1(x)$,

$$\alpha^2 = \alpha - 2, \qquad (3)$$

$\alpha^3 = \alpha^2 - 2\alpha = -(\alpha + 2)$, $\alpha^6 = \alpha^2 + 4\alpha + 4 = 2$. Hence

$$\alpha^6 = 2 = -3, \quad \alpha^{12} = 4 = -1, \quad \alpha^{18} = 3 = -2, \quad \alpha^{24} = 1 = -4. \qquad (4)$$

Therefore $\alpha^7 = 2\alpha$, $\alpha^{14} = -\alpha^2 = 2 - \alpha$. Hence $\alpha^7$ is a root of $f_4(x)$, and
$-\alpha^7 = \alpha^{19}$ is a root of $f_3(x)$. Thus the 4 pairs of roots (2) correspond to
the 4 polynomials (1) in the order as given there.

The elements

$$1, \ \alpha \qquad (5)$$

form a basis of $GF_{5^2}$ over $GF_5$. Thus one can express every element in
the form

$$a + b\alpha, \qquad (6)$$

where $a$ and $b$ are elements of $GF_5$, i.e. integral numbers mod 5. The
multiplication of two elements (6) can be effected by

$$(a + b\alpha)(a' + b'\alpha) = a'' + b''\alpha,$$

where, as a consequence of (3),

$$a'' = aa' + 3bb', \qquad b'' = ba' + (a + b)b'. \qquad (7)$$

From (7) one gets $a' + b'\alpha = (a'' + b''\alpha) : (a + b\alpha)$, where

$$a' = [(a + b)a'' + 2bb''] : (a^2 + ab + 2b^2), \qquad b' = [4ba'' + ab''] : (a^2 + ab + 2b^2). \qquad (8)$$

Since in a field the division by every element $\neq 0$ can be performed, one
must expect that for elements $a, b$ of $GF_5$ the divisor on the right hand side
of (8) cannot be 0 unless $a = b = 0$. Of course for $b = 0$ the divisor cannot
be equal to 0 unless $a = 0$. For $b \neq 0$, put $a : b = x$; then $x^2 + x + 2 =
f_2(x)$ [see (1)] is irreducible; hence there exists no $a : b = x$ in $GF_5$ by which
the equation $x^2 + x + 2 = 0$ could be satisfied. By the help of (7) and (8),
the multiplication and the division of any pair of elements (6) of $GF_{5^2}$ can be
performed, but the method is not very convenient.
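
Formulas (7) and (8) translate directly into a few lines of code. The following Python sketch stores $a + b\alpha$ as the pair `(a, b)`; the names `mul` and `div` are illustrative, and the inverse of the divisor in $GF_5$ is computed as a power, which is an assumption of this sketch (it relies on Fermat's theorem) rather than part of the formulas themselves.

```python
P = 5  # elements a + b*alpha of GF(25) are stored as pairs (a, b)


def mul(x, y):
    """Formula (7): (a + b*alpha)(a1 + b1*alpha) = a2 + b2*alpha."""
    a, b = x
    a1, b1 = y
    return ((a * a1 + 3 * b * b1) % P, (b * a1 + (a + b) * b1) % P)


def div(z, x):
    """Formula (8): the quotient (a2 + b2*alpha) : (a + b*alpha)."""
    a2, b2 = z
    a, b = x
    divisor = (a * a + a * b + 2 * b * b) % P
    if divisor == 0:
        raise ZeroDivisionError("division by the nullelement")
    # Assumption of this sketch: the inverse in GF(5) is taken as divisor**(P - 2).
    inv = pow(divisor, P - 2, P)
    return (((a + b) * a2 + 2 * b * b2) * inv % P, (4 * b * a2 + a * b2) * inv % P)


if __name__ == "__main__":
    product = mul((2, 3), (4, 2))
    print(product, div(product, (2, 3)))
```

The divisor vanishes only for the pair `(0, 0)`, so `div` undoes `mul` for every element different from 0.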

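To make the calculations easier, one may express $\alpha, \ldots, \alpha^{24}$ by the help
of the basis (5):

$$\begin{array}{llll}
\alpha^{1} = \alpha & \alpha^{7} = 2\alpha & \alpha^{13} = 4\alpha & \alpha^{19} = 3\alpha \\
\alpha^{2} = 3 + \alpha & \alpha^{8} = 1 + 2\alpha & \alpha^{14} = 2 + 4\alpha & \alpha^{20} = 4 + 3\alpha \\
\alpha^{3} = 3 + 4\alpha & \alpha^{9} = 1 + 3\alpha & \alpha^{15} = 2 + \alpha & \alpha^{21} = 4 + 2\alpha \\
\alpha^{4} = 2 + 2\alpha & \alpha^{10} = 4 + 4\alpha & \alpha^{16} = 3 + 3\alpha & \alpha^{22} = 1 + \alpha \\
\alpha^{5} = 1 + 4\alpha & \alpha^{11} = 2 + 3\alpha & \alpha^{17} = 4 + \alpha & \alpha^{23} = 3 + 2\alpha \\
\alpha^{6} = 2 & \alpha^{12} = 4 & \alpha^{18} = 3 & \alpha^{24} = 1
\end{array} \qquad (9)$$
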


In this table, one gets every column from the preceding one by multiplying
with $\alpha^6 = 2$; thus only very little calculation was necessary. It is
convenient to supplement (9) by a second table, wherefrom the exponent $m$ of
$\alpha^m = a + b\alpha$ is seen when $a$ and $b$ are known. The rows in the following
table give the values of $a$, the columns give the values of $b$.

$$\begin{array}{c|ccccc}
 & b = 0 & 1 & 2 & 3 & 4 \\ \hline
a = 0 & & 1 & 7 & 19 & 13 \\
1 & 24 & 22 & 8 & 9 & 5 \\
2 & 6 & 15 & 4 & 11 & 14 \\
3 & 18 & 2 & 23 & 16 & 3 \\
4 & 12 & 17 & 21 & 20 & 10
\end{array} \qquad (10)$$


E.g. to calculate $(2 + 3\alpha)(4 + 2\alpha)$, one finds in (10) that the exponents
of the two factors are 11 and 21; the product is therefore equal to $\alpha^{32} = \alpha^8$, and
from (9) one sees that $\alpha^8 = 1 + 2\alpha$. In other examples, one has to proceed
in a similar way. When the primenumber $p$ is very large, it is convenient
to provide for a special table for the multiplication in the primefield $GF_p$;
the arrangement of the "log-tables" (9) and (10) must be made according to
the special needs of the problem concerned.
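
The two "log-tables" can also be generated mechanically. The Python sketch below builds both of them from the relation $\alpha^2 = \alpha - 2$ and multiplies by adding exponents; the names `POWER`, `LOG`, `times_alpha` and `mul_by_tables` are illustrative assumptions.

```python
P = 5
ORDER = P * P - 1  # number of elements different from 0


def times_alpha(x):
    a, b = x
    # alpha*(a + b*alpha) = a*alpha + b*(alpha - 2) = 3b + (a + b)*alpha  (mod 5)
    return ((3 * b) % P, (a + b) % P)


# Table (9): POWER[m] holds the pair (a, b) with alpha**m = a + b*alpha.
POWER = {}
x = (1, 0)
for m in range(1, ORDER + 1):
    x = times_alpha(x)
    POWER[m] = x

# Table (10): LOG[(a, b)] holds the exponent m.
LOG = {pair: m for m, pair in POWER.items()}


def mul_by_tables(x, y):
    """Multiply two elements by adding their exponents mod 24."""
    if (0, 0) in (x, y):
        return (0, 0)
    m = (LOG[x] + LOG[y]) % ORDER
    return POWER[m or ORDER]


if __name__ == "__main__":
    print(LOG[(2, 3)], LOG[(4, 2)], mul_by_tables((2, 3), (4, 2)))
```

The nullelement has no exponent and is therefore treated separately, just as the corresponding place in table (10) is left empty.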

3-24 Application on the theory of numbers. The theory of the Galoisfield
is very closely connected with the elementary theory of numbers. Some
arithmetical propositions are immediate consequences of properties of the
primefields $GF_p$. Equality of two elements of $GF_p$ means that the
corresponding integral numbers are congruent (mod $p$). As in $GF_p$ every
element $\neq 0$ is a root of $x^{p-1} - 1$, it follows for integral numbers

Fermat's theorem:

$$n^{p-1} \equiv 1 \pmod p, \quad \text{for } n \text{ not divisible by } p. \qquad (1)$$

To every element $\alpha \neq 0$ in $GF_{p^n}$ there exists an inverse element $\beta$
such that $\alpha\beta = 1$; only $1$ and $-1$ are roots of $x^2 = 1$ and therefore
self-inverse; the other elements $\neq 0$ are divided into pairs of inverse elements.

Hence the product of all elements $\neq 0$ of $GF_{p^n}$ is $-1$.

For $n = 1$, Wilson's theorem follows:

$$(p - 1)! \equiv -1 \pmod p. \qquad (2)$$
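
Both congruences are easy to test for small primenumbers. The following Python sketch does so by direct computation; the function names `fermat_holds` and `wilson_holds` are illustrative.

```python
from math import factorial


def fermat_holds(p):
    """Check n**(p - 1) = 1 (mod p) for every n = 1, ..., p - 1."""
    return all(pow(n, p - 1, p) == 1 for n in range(1, p))


def wilson_holds(p):
    """Check (p - 1)! = -1 (mod p)."""
    return factorial(p - 1) % p == p - 1


if __name__ == "__main__":
    for p in (2, 3, 5, 7, 11, 13):
        print(p, fermat_holds(p), wilson_holds(p))
```

For a number which is not a primenumber, such as 9, both checks fail.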
Let $p$ be $\neq 2$, $p^n - 1 = 2m$. As $x^{p^n - 1} - 1 = (x^m - 1)(x^m + 1)$, there
are 2 classes of elements $\neq 0$ in $GF_{p^n}$: the $m$th power of the elements of the
first class is $+1$, and the $m$th power of those of the second is $-1$. If $\alpha$ is
a primitive root of $GF_{p^n}$, the elements $\alpha, \ldots, \alpha^{2m} = 1$ are all different,
hence $\alpha^m = -1$, and therefore the odd powers of $\alpha^m$ are $-1$, the even powers
are $+1$. If $\beta = \gamma^2$, then $\beta^m = (\gamma^m)^2 = (\pm 1)^2 = 1$. Hence every square
is an element of the first kind, and every element of the first kind is an even
power of $\alpha$ and therefore a square. The product of two elements of a
different kind is of the second kind and the product of two elements of the
same kind is of the first kind. The element $-1$ is of the first kind if
$(-1)^m - 1 = 0$, i.e. if $m$ is even, and it is of the second kind if $m$ is odd.

In $GF_p$, to every element $\gamma$ there corresponds a class $(c)$ of integral numbers
congruent to $c$ (mod $p$), and $\gamma$ is a square in $GF_p$ if and only if $x^2 \equiv c \pmod p$
has solutions. In this case $c$ is said to be a quadratic residue of $p$; if there
is no solution, $c$ is a quadratic non-residue. Hence there are $\frac{1}{2}(p - 1)$
quadratic residues and $\frac{1}{2}(p - 1)$ quadratic non-residues. The product of two
residues (two non-residues) is a residue; the product of a residue and a
non-residue is a non-residue.

Again put $p = 2m + 1$; then the element $-1$ of $GF_p$ is a square if
and only if it is a root of $x^m - 1$, i.e. if $m$ is even, say $m = 2n$. Hence

$$\begin{array}{l}
-1 \text{ is a quadratic residue of the primenumbers } 4n + 1, \\
-1 \text{ is a quadratic non-residue of the primenumbers } 4n - 1.
\end{array} \qquad (3)$$
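
The two classes of elements can be listed explicitly for a small primenumber. The Python sketch below collects the squares directly and compares them with the test by the $m$th power, $m = \frac{1}{2}(p - 1)$; the names `quadratic_residues` and `is_residue` are illustrative.

```python
def quadratic_residues(p):
    """All c in 1, ..., p - 1 for which x*x = c (mod p) has a solution."""
    return sorted({x * x % p for x in range(1, p)})


def is_residue(c, p):
    """Elements of the first kind: the m-th power is +1, where m = (p - 1) / 2."""
    return pow(c, (p - 1) // 2, p) == 1


if __name__ == "__main__":
    for p in (5, 7, 13):
        # p - 1 represents the element -1 of GF(p)
        print(p, quadratic_residues(p), is_residue(p - 1, p))
```

For $p = 13 = 4 \cdot 3 + 1$ the element $-1$ is found among the residues, for $p = 7 = 4 \cdot 2 - 1$ it is not.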

The primitive roots of $x^{p^n - 1} - 1$ are the roots of the cyclotomic polynomial,
which is of degree $\varphi(p^n - 1)$. In $GF_{p^n}$, each of these roots is a root of a
polynomial of degree $n$ which is irreducible in $GF_p$, since a primitive root
is a primitive element of $GF_{p^n}$ when this field is considered as an extension
of its primefield. The cyclotomic polynomial is therefore a product of
polynomials which are irreducible in $GF_p$ and each of degree $n$.

Hence $\varphi(p^n - 1)$ is divisible by $n$ when $p$ is a primenumber.
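
This divisibility property can be confirmed for small cases by counting. A minimal Python sketch follows, with `phi` as an illustrative name for Euler's function, computed here by brute force.

```python
from math import gcd


def phi(m):
    """Euler's function: the number of k in 1, ..., m relatively prime to m."""
    return sum(1 for k in range(1, m + 1) if gcd(k, m) == 1)


if __name__ == "__main__":
    for p in (2, 3, 5):
        for n in range(1, 6):
            # phi(p**n - 1) should be a multiple of n
            print(p, n, phi(p ** n - 1), phi(p ** n - 1) % n == 0)
```

For $p = 5$, $n = 2$ one gets $\varphi(24) = 8$, which agrees with the 8 primitive roots and the 4 irreducible factors of degree 2 found in 3-23.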


*3-25 Application on finite geometries. Galoisfields have been used to
construct finite geometries. Consider e.g. plane projective geometry. Its
fundamental notions are: point, straight line and the relation of incidence.
Analytically, the points as well as the straight lines are represented as classes
of triplets of real numbers; two triplets belong to the same class if they
differ only by a common factor $\neq 0$; the triplet $(0, 0, 0)$ must be omitted.
Thus one considers points $P = \rho(x_1, x_2, x_3)$ and straight lines $g = \sigma[u_1, u_2, u_3]$;
the condition of incidence is

$$x_1 u_1 + x_2 u_2 + x_3 u_3 = 0.$$

For a certain portion of plane projective geometry, it is not important
that the numbers $\rho, \sigma, x_i, u_i$ are real; they may be complex or they may
be elements of any particular field. If one selects a Galoisfield for this
purpose, one gets a Veblen geometry, and the number of points and straight
lines is finite.

The simplest case corresponds to $GF_2$. Here 0 and 1 are the only elements.
The arbitrary factors $\rho \neq 0$ and $\sigma \neq 0$ are equal to 1 and can
therefore be omitted. This geometry consists of 7 points and 7 straight
lines:

A = (0, 0, 1), B = (0, 1, 0), C = (0, 1, 1), D = (1, 0, 0), E = (1, 0, 1),
F = (1, 1, 0), G = (1, 1, 1);

a = [0, 0, 1], b = [0, 1, 0], c = [0, 1, 1], d = [1, 0, 0], e = [1, 0, 1],
f = [1, 1, 0], g = [1, 1, 1].

Every straight line is incident with 3 points, and these are distributed on the
straight lines as follows:

a   B D F
b   A D E
c   C D G
d   A B C
e   B E G
f   A F G
g   C E F

In a similar way, one gets more complicated Veblen geometries by using any
kind of Galoisfield to build up a projective or an affine geometry in $n$
dimensions.
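
The distribution of the 7 points on the 7 straight lines follows mechanically from the condition of incidence over $GF_2$. The Python sketch below recomputes it; the variable names `points`, `lines`, `incident` and `table` are illustrative.

```python
from itertools import product

# All triplets over GF(2) except (0, 0, 0), in the order used for A, ..., G and a, ..., g.
triplets = [t for t in product((0, 1), repeat=3) if any(t)]
points = dict(zip("ABCDEFG", triplets))
lines = dict(zip("abcdefg", triplets))


def incident(x, u):
    """Condition of incidence x1*u1 + x2*u2 + x3*u3 = 0 in GF(2)."""
    return sum(xi * ui for xi, ui in zip(x, u)) % 2 == 0


table = {name: "".join(pt for pt, x in points.items() if incident(x, u))
         for name, u in lines.items()}

if __name__ == "__main__":
    for name, pts in table.items():
        print(name, pts)
```

Replacing the arithmetic mod 2 by the arithmetic of another Galoisfield, and taking the triplets up to a common factor, would give the larger geometries in the same manner.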

3-251 Application on statistical analysis. Consider now the above
scheme without any respect to its geometrical significance nor to the algebraic
method by which it has been attained. It consists of $v = 7$ varieties A, B, C,
D, E, F, G, each being repeated $r = 3$ times in the scheme. The varieties
are arranged into $b = 7$ blocks each consisting of $k = 3$ different varieties.
Each of the $v(v - 1) : 2$ pairs of varieties occurs in $\lambda = 1$ blocks. Schemes
of this kind (for various numbers $v, b, r, k, \lambda$) are called in Statistics balanced
incomplete block designs, and they seem to be very important for the design
of agricultural experiments amenable to exact statistical analysis. In
the last few years the Galoisfields and the methods of finite geometry have
been used successfully for the construction of those designs. It is a startling
idea that Galoisfields might be helpful to provide people with more and
better food.
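
The parameters of the design can be counted directly from the blocks. The Python sketch below does this for the scheme of 3-25; the variable names are illustrative, and the two closing relations $bk = vr$ and $\lambda(v - 1) = r(k - 1)$ are the usual counting identities for such designs, stated here as an addition to the text.

```python
from itertools import combinations

blocks = ["BDF", "ADE", "CDG", "ABC", "BEG", "AFG", "CEF"]
varieties = sorted(set("".join(blocks)))

v = len(varieties)
b = len(blocks)
# Number of blocks in which each variety occurs.
replications = {x: sum(x in blk for blk in blocks) for x in varieties}
# Number of blocks in which each pair of varieties occurs together.
pair_counts = {pair: sum(set(pair) <= set(blk) for blk in blocks)
               for pair in combinations(varieties, 2)}

if __name__ == "__main__":
    r, k, lam = 3, 3, 1
    print(v, b, set(replications.values()), set(pair_counts.values()))
    print(b * k == v * r, lam * (v - 1) == r * (k - 1))
```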

3-3 The fields K(i)

3-31 The general case. Given a field $K$, where

$$x^2 + 1 \qquad (1)$$

is irreducible; consider the extension $K(i)$ of $K$, where $i$ is a root of (1).
At first, let $K$ be of characteristic $p$. In $GF_2$, $x^2 + 1 = (x + 1)^2$, and for
$p = 4n + 1$, the polynomial (1) is reducible since $-1$ is a quadratic
residue of $p$ (see 3-24); hence $p = 4n + 3$. $K$ may also be of characteristic
0; indeed (1) is irreducible in every field which consists of real numbers,
and in isomorphic fields. If $a$ and $b$ are elements of $K$,

$$\text{then } a^2 + b^2 = 0 \text{ implies } a = b = 0, \qquad (2)$$

since for $b \neq 0$, the element $a : b$ of $K$ would have to be a root of (1). The field $K(i)$
is isomorphic to the field formed by the classes of residues of the polynomial
(1) in the ring $K[x]$. A. L. Cauchy introduced the complex numbers in
this manner, choosing $K$ as the field of the real numbers. The elements

$$1, \ i \qquad (3)$$

form a basis of $K(i)$. Thus the elements are all represented by

$$a + bi, \qquad (4)$$

where $a, b$ run over all the elements of $K$. The multiplication formula is

$$(a + bi)(a' + b'i) = a'' + b''i,$$

where

$$a'' = aa' - bb', \qquad b'' = ab' + ba'. \qquad (5)$$

The division-formula is therefore

$$(a'' + b''i) : (a + bi) = a' + b'i,$$

where

$$a' = (aa'' + bb'') : (a^2 + b^2), \qquad b' = (ab'' - ba'') : (a^2 + b^2). \qquad (6)$$

Independent of the general theory given before, one can define the field
$K(i)$ in the following manner: "Given a field $K$ satisfying (2); then pairs
$a, b$ of elements of $K$ form a vectorspace of elements which are denoted by (4).
Determine now the multiplication of the vectors by (5), and in consequence
of it, the division by (6); then the vectorspace is made a field in which
$0 + 1i$ and $0 - 1i$ are the roots of $x^2 + 1$." It is left to the reader to check
this statement in all its details. Put the field of the real numbers for $K$;
then one gets the most usual way of introducing complex numbers into
analysis. By interpreting the vectorspace of rank 2, formed by the elements
(4), as a Euclidean space, one comes to the ordinary geometrical
representation of the complex numbers.
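
The definition by pairs lends itself to a direct check. The Python sketch below takes the rational numbers for $K$ (an assumption made for the sake of exact arithmetic) and implements (5) and (6) on pairs `(a, b)`; the names `mul` and `div` are illustrative.

```python
from fractions import Fraction


def mul(x, y):
    """Formula (5): (a + b*i)(a1 + b1*i)."""
    a, b = x
    a1, b1 = y
    return (a * a1 - b * b1, a * b1 + b * a1)


def div(z, x):
    """Formula (6): (a2 + b2*i) : (a + b*i); by (2) the divisor is 0 only for a = b = 0."""
    a2, b2 = z
    a, b = x
    n = Fraction(a * a + b * b)
    if n == 0:
        raise ZeroDivisionError("division by the nullelement")
    return ((a * a2 + b * b2) / n, (a * b2 - b * a2) / n)


if __name__ == "__main__":
    i = (Fraction(0), Fraction(1))
    print(mul(i, i))  # the square of 0 + 1i
    x = (Fraction(1), Fraction(2))
    y = (Fraction(3, 4), Fraction(-5))
    print(div(mul(x, y), x) == y)
```

The same two functions work over any field $K$ satisfying (2), provided the arithmetic of $K$ is substituted for that of the rational numbers.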

In $K(i)[x]$, there is $x^2 + 1 = (x + i)(x - i)$. The field $K(i)$ is a normal
extension of $K$, and $i$ is a primitive element. From 2-742 it follows that the
extension admits only one automorphism $A$ which is different from the
identity, and that $A$ interchanges the elements $+i$ and $-i$. Elements
which are interchanged by $A$ are said to be conjugate; and the conjugate
of an element $\alpha$ is denoted by $\bar\alpha$. Hence

$$\alpha = a + bi \text{ implies } \bar\alpha = a - bi. \qquad (7)$$

Hence an element is selfconjugate if and only if it belongs to $K$. The product
of 2 conjugate elements is selfconjugate and is said to be the norm $N$ of
those elements. Using the notation (7) one gets therefore

$$\alpha\bar\alpha = N(\alpha) = N(\bar\alpha) = a^2 + b^2. \qquad (8)$$

From (2) and (8) it follows therefore that $N(\alpha) = 0$ implies $\alpha = 0$.
Furthermore

$$N(\alpha\beta) = N(\alpha) N(\beta), \qquad N(\alpha : \beta) = N(\alpha) : N(\beta),$$

and for elements $a$ of $K$, there is $N(a) = a^2$.

3-32 The field R(i). Consider in particular the field $R(i)$, where $R$
denotes the field of the rational numbers. The elements of $R(i)$ can be
represented by $(a + bi) : c$, where $a, b, c$ are integral numbers. Hence $R(i)$
is the quotientfield of the integral domain $S$ which consists of all the elements
$a + bi$, where $a$ and $b$ are integers. If $\alpha \neq 0$ is an element of $S$, then $N(\alpha)$
is a positive integral number; if in particular $\alpha$ is a unity of $S$, then $N(\alpha)$
and $N(1 : \alpha) = 1 : N(\alpha)$ must be positive integral numbers, and
therefore $N(\alpha) = a^2 + b^2 = 1$ holds. Hence $1, -1, i, -i$ are the only unities
of $S$, and at the same time the only elements of $S$ for which the norm is equal
to 1.

Theorem 1. $S$ is a Euclidean domain.

Proof. It suffices to show that for the elements $\alpha$ of $S$, the function
$N(\alpha)$ has the properties of a norm-function as required in 2-42 and 2-44.
Indeed for every $\alpha \neq 0$ the norm $N(\alpha)$ is a positive integral number, and
from $N(\alpha\beta) = N(\alpha) N(\beta)$ it follows that $N(\alpha\beta) \geq N(\alpha)$, where equality
holds if and only if $\beta$ is a unity; hence the conditions of 2-42 are satisfied.
It remains to prove (see 2-44) that to every pair $\alpha_1 \neq 0$, $\alpha_2 \neq 0$ of elements
of $S$ there exist such elements $\beta$ and $\alpha_3$ that

$$\alpha_1 = \beta\alpha_2 + \alpha_3, \quad \text{where } N(\alpha_2) > N(\alpha_3).$$

To prove this proposition, consider the element $\omega = \alpha_1 : \alpha_2 = r + si$ of $R(i)$.
The rational numbers $r$ and $s$ can be represented by $r = a + r'$, $s = b + s'$,
where $a$ and $b$ are integers and $|r'| \leq \frac{1}{2}$, $|s'| \leq \frac{1}{2}$. Hence $\omega = \beta + \omega'$, where
$\beta = a + bi$ is an element of $S$ and $N(\omega') = r'^2 + s'^2 \leq \frac{1}{2}$. By multiplying with $\alpha_2$
it follows that $\alpha_1 = \beta\alpha_2 + \omega'\alpha_2$. But $\omega'\alpha_2 = \alpha_1 - \beta\alpha_2$ must be an element of
$S$, say $\alpha_3$, and $N(\alpha_3) = N(\alpha_2) N(\omega') \leq \frac{1}{2} N(\alpha_2) < N(\alpha_2)$. Hence the theorem.
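
The proof is constructive and can be turned into a division with remainder, and from there into the algorithmus of the h.c.f. The Python sketch below stores $a + bi$ as the pair `(a, b)`; all function names are illustrative, and the rounding rule in `nearest` is one possible way (an assumption of this sketch) of choosing integers $a, b$ with $|r'| \leq \frac{1}{2}$, $|s'| \leq \frac{1}{2}$.

```python
def norm(z):
    return z[0] * z[0] + z[1] * z[1]


def add(x, y):
    return (x[0] + y[0], x[1] + y[1])


def sub(x, y):
    return (x[0] - y[0], x[1] - y[1])


def mul(x, y):
    return (x[0] * y[0] - x[1] * y[1], x[0] * y[1] + x[1] * y[0])


def nearest(n, d):
    """An integer k with |n/d - k| <= 1/2, for d > 0."""
    return (2 * n + d) // (2 * d)


def divmod_gauss(a1, a2):
    """Return (beta, a3) with a1 = beta*a2 + a3 and N(a3) <= N(a2)/2."""
    n = norm(a2)
    # omega = a1 : a2 = a1 * conjugate(a2) : N(a2)
    re = a1[0] * a2[0] + a1[1] * a2[1]
    im = a1[1] * a2[0] - a1[0] * a2[1]
    beta = (nearest(re, n), nearest(im, n))
    return beta, sub(a1, mul(beta, a2))


def gcd_gauss(x, y):
    """An h.c.f. of x and y, determined up to a unity."""
    while y != (0, 0):
        _, r = divmod_gauss(x, y)
        x, y = y, r
    return x


if __name__ == "__main__":
    print(divmod_gauss((27, 23), (8, 1)))
    print(gcd_gauss((5, 0), (2, 1)))
```

Since the norm of the remainder is at most half the norm of the divisor, the loop in `gcd_gauss` ends after a finite number of steps.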

Corollary. In $S$ the factorisation is unique, and for every two elements
there exists an h.c.f. which can be determined by the algorithmus of the
h.c.f.

Proof. It has been shown in 2-44, theorem 2, that the property holds
in every Euclidean domain.

$S$ contains the domain $I$ of the integral numbers as a subring. It is
interesting to compare the factorisation in $S$ with the factorisation in $I$.
Every rational number contained in $S$ is an integral number. Hence if an
integral number $a$ is divisible in $S$ by an integral number $b$, then $a : b$ is
integral, and therefore $a$ is divisible by $b$ also in $I$. If an integral number
is divisible by an element $\alpha$ of $S$, it is also divisible by $\bar\alpha$, as the divisibility
is invariant for the automorphism $A$. If $\beta$ is a factor of $\alpha$, then $N(\beta)$ is a
factor of $N(\alpha)$. Hence if $N(\alpha) = p$ is a primenumber, $\alpha$ is a prime-element of
$S$. Let now $\pi$ be any prime-element of $S$, and let $N(\pi) = \pi\bar\pi$ be divisible
by a primenumber $p$; then $p$ must be divisible by $\pi$ or by its conjugate and
therefore by both of them. Hence either the two elements are associated to
$p$ and therefore $\pi = i^\nu p$ (where $\nu = 0, 1, 2, 3$), or $N(\pi) = p$. A prime-element
of $S$ is therefore either associated to a primenumber or it is generated
by splitting a primenumber into two conjugate prime-factors. Consider
the three cases:

(1) $p = 2 = i(1 - i)^2$. The primenumber 2 is associated to the
square of $1 - i$; this element of $S$ is a prime-element as $N(1 - i) = 2$
is a primenumber.

(2) A primenumber of the type $4n + 3$ cannot be represented as a
norm of an element of $S$, as $N(\alpha) = a^2 + b^2 \not\equiv 3 \pmod 4$. Hence these
primenumbers cannot be split up into two conjugate prime-factors; hence
they are prime-elements of $S$.

(3) The primenumbers $p = 4n + 1$ are products of two non-associated
prime-elements.

Proof. As it has been shown in 3-24, the number $-1$ is a quadratic
residue of $p$. Hence there exists an integer $d$ such that $d^2 + 1 = pm$. On
the other hand $d^2 + 1 = (d + i)(d - i)$, but none of the factors on the right
hand side is divisible by $p$; hence $p$ cannot be a prime-element of $S$; it is
therefore a product $(a + bi)(a - bi)$ of conjugate prime-elements. If these
elements were associates, they could differ by a factor $\pm 1$ or $\pm i$
only. This is possible only if either $a = 0$, or $b = 0$, or $|a| = |b|$. Since
$p$ is supposed to be a primenumber of the type $4n + 1$, these cases cannot
occur, and $p$ is a product of two non-associate prime-elements of $S$.

If the primenumbers $p_1$ and $p_2$ have a common prime-factor $\pi$, then its
conjugate is also a common prime-factor of them and $p_1 = p_2$ holds. The
factorisation of the primenumbers into prime-elements of $S$ is therefore
determined by the following theorem.

Theorem 2. The primenumbers of type $4n + 3$ are prime-elements of
$S$; the primenumber 2 is associate to the square of the prime-element
$1 - i$; the primenumbers $p_i = 4n + 1$ are equal to products $\pi_i \bar\pi_i$ of
conjugate and non-associate prime-elements. All these prime-elements are
different and non-associate, and every prime-element of $S$ is associate
to one of them.
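
The behaviour of the primenumbers in $S$ can be observed by a small search. The Python sketch below looks for a representation $p = a^2 + b^2 = (a + bi)(a - bi)$ by trial; the names `is_prime` and `split_prime` are illustrative, and the search by trial is chosen for brevity only and is not the method of the proof of (3).

```python
from math import isqrt


def is_prime(p):
    return p > 1 and all(p % d for d in range(2, isqrt(p) + 1))


def split_prime(p):
    """Return (a, b) with a <= b and a*a + b*b == p, or None if no such pair exists."""
    for a in range(isqrt(p) + 1):
        b = isqrt(p - a * a)
        if a <= b and a * a + b * b == p:
            return (a, b)
    return None


if __name__ == "__main__":
    for p in filter(is_prime, range(2, 40)):
        print(p, p % 4, split_prime(p))
```

The primenumbers of type $4n + 3$ yield `None`, while 2 and the primenumbers of type $4n + 1$ yield a pair $a, b$.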

Denoting the primenumbers of type $4n + 3$ by $q_1, q_2, \ldots$ and the
prime-elements $a + bi$ by $\pi_1, \pi_2, \ldots$, the elements $\neq 0$ of $S$ can therefore
be represented by

$$\alpha = i^\nu (1 - i)^r q_{h_1} \cdots q_{h_s} \pi_{m_1} \cdots \pi_{m_t}. \qquad (1)$$

Hence

$$N(\alpha) = \alpha\bar\alpha = 2^r q_{h_1}^2 \cdots q_{h_s}^2 p_{m_1} \cdots p_{m_t}, \qquad (2)$$

where the $p$'s are primenumbers of the type $4n + 1$ and products of two
conjugate prime-elements. Obviously every product (2) (where neither the
$q$'s nor the $p$'s are necessarily different one from another) can be considered
as a product of two conjugate elements $\alpha\bar\alpha$. An integral number is
therefore the norm of an element $\neq 0$ of $S$ if and only if it can be represented by
(2). On the other hand the necessary and sufficient condition for a norm
is that it is the sum of two squares. Thus the considerations about the
ring $S$ lead to the following theorem on integral numbers.

Theorem 3. A positive integral number $c$ can be represented as the sum
of two squares if and only if in the representation of $c$ as a product of powers
of different primenumbers, the primenumbers of type $4n + 3$ occur with
even exponents only.
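
Theorem 3 can be compared with a direct search for two squares. The Python sketch below implements both sides of the equivalence; the names `is_sum_of_two_squares` and `criterion` are illustrative, and 0 is admitted as one of the squares, in agreement with norms such as $N(a) = a^2$.

```python
from math import isqrt


def is_sum_of_two_squares(c):
    """Direct search for integers a, b with a*a + b*b == c (0 is allowed)."""
    return any(isqrt(c - a * a) ** 2 == c - a * a for a in range(isqrt(c) + 1))


def criterion(c):
    """Theorem 3: primenumbers of type 4n + 3 occur with even exponents only."""
    p = 2
    while p * p <= c:
        exponent = 0
        while c % p == 0:
            c //= p
            exponent += 1
        if p % 4 == 3 and exponent % 2 == 1:
            return False
        p += 1
    # What remains is 1 or a single primenumber occurring with exponent 1.
    return c % 4 != 3


if __name__ == "__main__":
    print([c for c in range(1, 30) if criterion(c)])
    print(all(criterion(c) == is_sum_of_two_squares(c) for c in range(1, 2000)))
```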

3-33 A generalisation. If $K$ is the field of the real numbers, and $\alpha$
runs over $K(i)$, then $N(\alpha)$ runs over the non-negative real numbers. The
sum of two norms $N(\alpha) + N(\beta)$ is again* a "norm" and is different from 0,
unless $\alpha = \beta = 0$. This case admits an interesting generalisation which
concerns a special case of what has been considered at the end of 2-742.

Suppose that $K$ and $\Lambda$ are two fields such that $[\Lambda : K] = 2$, and that
to every pair of elements $(\alpha, \alpha') \neq (0, 0)$ of $\Lambda$ there exists an element
$\alpha'' \neq 0$ satisfying

$$\alpha\bar\alpha + \alpha'\bar\alpha' = \alpha''\bar\alpha''. \qquad (1)$$

Let $\alpha_1, \alpha_2, \alpha_3$ be different from $0, 0, 0$. Without loss of generality suppose
$\alpha_1 \neq 0$. Then $(\alpha_1\bar\alpha_1 + \alpha_2\bar\alpha_2) + \alpha_3\bar\alpha_3 = \alpha_4\bar\alpha_4 + \alpha_3\bar\alpha_3 = \alpha\bar\alpha$, where $\alpha_4$ and
therefore $\alpha$ are different from 0. By repetition of this procedure one gets

Theorem 1. If $K$ and $\Lambda$ have the properties as supposed here, for every
$n$-tuplet $\alpha_1, \ldots, \alpha_n \neq 0, \ldots, 0$ of elements of $\Lambda$ there is

$$\alpha_1\bar\alpha_1 + \cdots + \alpha_n\bar\alpha_n = \alpha\bar\alpha \neq 0. \qquad (2)$$

For elements $a_1, \ldots, a_n$ of $K$, it follows from (2):

$$a_1^2 + \cdots + a_n^2 = \alpha\bar\alpha \neq 0. \qquad (2')$$

By putting $a_1 = \cdots = a_n = 1$, one sees that the characteristic of $K$ must
be 0. Moreover $\alpha_1\bar\alpha_1 + \cdots + \alpha_n\bar\alpha_n + 1 \neq 0$, and therefore the left sides
of (2) and (2') are always $\neq -1$. This shows that the field $K$ cannot be
chosen arbitrarily. The following theorem is important for the theory of
matrices (see 6-5).

Theorem 2. If $K$ and $\Lambda$ have the properties as supposed here, and
$v^1_1, \ldots, v^1_n \neq 0, \ldots, 0$ are elements of $\Lambda$, then there exist in $\Lambda$ $n^2$ elements
$u^i_k$ and an element $\alpha_1$ satisfying the conditions

$$\sum_k u^i_k \bar u^j_k = 0 \ \text{ for } i \neq j, \qquad \sum_k u^i_k \bar u^i_k = 1, \qquad \alpha_1 u^1_k = v^1_k \quad (k = 1, \ldots, n).$$

Proof. Let $v^2_1, \ldots, v^2_n$ be any solution $\neq 0, \ldots, 0$ of $\sum_k \bar v^1_k x_k = 0$; if $n > 2$,
then the two equations $\sum_k \bar v^1_k x_k = 0$, $\sum_k \bar v^2_k x_k = 0$ again admit solutions;
let $v^3_1, \ldots, v^3_n$ be one of them. Continuing in this manner, one gets $n^2$
elements $v^i_k$ for which $\sum_k v^i_k \bar v^j_k = 0$ for $i \neq j$ holds. Now $\sum_k v^i_k \bar v^i_k = \alpha_i \bar\alpha_i
\neq 0$. Hence $u^i_k = v^i_k : \alpha_i$ is a system with the required properties.

* $N(\alpha) = \alpha\bar\alpha$ as in 3-32, but since it is not always an integral number, it is not
a norm function (see 2-42 and 2-44); for this reason "norm" is put in inverted
commas.

Exercises. 1. The elements $\alpha\bar\alpha - \alpha'\bar\alpha'$ form a field.

2. This field is identical with $K$.

3. If every "norm" $\alpha\bar\alpha$ is equal to the "norm" $a^2$ of an element of $K$,
then $\Lambda = K(i)$.

3-4 Irreducibility of polynomials 

In nearly every application of the methods of general algebra to a
particular problem, one is faced with the task of establishing the irreducibility
of a polynomial. As irreducibility depends on the coefficients of the
polynomial as well as on the field for which it should be proved, the problem
needs particular investigations for the single cases. Criteria have been
developed especially for the field of the rational numbers. Some of the
principles used there can be generalised for a larger class of fields.

*3-41 A general method. Consider at first a method which, though
applicable to every field, is nevertheless of little practical use. A polynomial
$f(x)$ of degree $n$ will be shown to be irreducible in $K[x]$ if and only if a
homogeneous equation of the $n$th degree, $F(x_1, \ldots, x_n) = 0$, has no solution
$a_1, \ldots, a_n \neq 0, \ldots, 0$ in $K$. The coefficients of $F$ belong to the ring generated
by the coefficients of $f(x)$, and they can be calculated by elementary methods.
However it is in general not easier to show that $F(x_1, \ldots, x_n) = 0$ has no solution
in $K$ than to establish the irreducibility of $f(x)$. On the contrary, the
irreducibility of $f(x)$, when proved, helps to show that $F$ has no solution in $K$.
The investigation runs as follows.

The classes of residues of a polynomial $f(x)$ in $K[x]$ form a ring which
is a field if and only if $f(x)$ is irreducible in $K[x]$ (see 2-47). Let $f(x)$ be of
degree $n$, and $\alpha$ be a root of $f(x)$; $\alpha^n, \alpha^{n+1}, \ldots$ can be linearly expressed by
$1, \alpha, \ldots, \alpha^{n-1}$, the coefficients being dependent on those of $f(x)$ only. Hence

$$(x_1 \alpha + \cdots + x_{n-1} \alpha^{n-1} + x_n)(y_1 \alpha + \cdots + y_{n-1} \alpha^{n-1} + y_n) =
z_1 \alpha + \cdots + z_{n-1} \alpha^{n-1} + z_n, \qquad (1)$$

where

$$z_j = \sum_{i,k} a^{ik}_j x_i y_k \quad \text{for } j = 1, \ldots, n,$$

and the $a^{ik}_j$ are elements of $K$.

The elements $1, \alpha, \ldots, \alpha^{n-1}$ form a basis of a field $K(\alpha)$ with (1)
determining the multiplication if and only if to every pair of systems
$(x_1, \ldots, x_n) \neq (0, \ldots, 0)$ and $(z_1, \ldots, z_n)$ of elements of $K$ there exists a
uniquely determined system $(y_1, \ldots, y_n)$. Hence the equations

$$y_1 d^1_j + \cdots + y_n d^n_j = z_j, \quad \text{where } d^k_j = \sum_i a^{ik}_j x_i,$$

must have a uniquely determined solution. The necessary and sufficient
condition is $\det(d^k_j) \neq 0$. But the left hand side of this inequality
is a homogeneous polynomial $F$ of degree $n$ in $x_1, \ldots, x_n$. Hence

$$F(x_1, \ldots, x_n) \neq 0 \qquad (2)$$

for every system $(x_1, \ldots, x_n) \neq (0, \ldots, 0)$ of elements of $K$ is the necessary and
sufficient condition for $f(x)$ to be irreducible in $K[x]$.

* May be omitted at the first reading.
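
For $n = 2$ the form $F$ is easily written down. With $f(x) = x^2 + lx + c$ and $\alpha^2 = -l\alpha - c$, the product $(x_1\alpha + x_2)(y_1\alpha + y_2)$ gives $z_1 = (x_2 - l x_1) y_1 + x_1 y_2$ and $z_2 = -c x_1 y_1 + x_2 y_2$, so that $F(x_1, x_2) = x_2^2 - l x_1 x_2 + c x_1^2$. This special case is worked out here as an illustration and is not taken from the text. The Python sketch below compares, over $GF_p$, the existence of a solution $\neq 0, 0$ of $F = 0$ with the existence of a root of $f(x)$; the names `F`, `form_has_zero`, `has_root`, `lin` and `const` are illustrative.

```python
def F(x1, x2, lin, const, p):
    """Determinant of the multiplication by x1*alpha + x2, where alpha**2 + lin*alpha + const = 0."""
    return (x2 * x2 - lin * x1 * x2 + const * x1 * x1) % p


def form_has_zero(lin, const, p):
    """Does F(x1, x2) = 0 have a solution different from (0, 0) in GF(p)?"""
    return any(F(x1, x2, lin, const, p) == 0
               for x1 in range(p) for x2 in range(p) if (x1, x2) != (0, 0))


def has_root(lin, const, p):
    """Is x**2 + lin*x + const reducible in GF(p)[x], i.e. does it have a root?"""
    return any((t * t + lin * t + const) % p == 0 for t in range(p))


if __name__ == "__main__":
    # x**2 - x + 2 over GF(5): F is the divisor a**2 + a*b + 2*b**2 met in 3-23
    print(form_has_zero(-1, 2, 5), has_root(-1, 2, 5))
    # x**2 + 1 over GF(5) and over GF(7)
    print(form_has_zero(0, 1, 5), form_has_zero(0, 1, 7))
```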

Exercise. To every polynomial $x^3 + px + q$ which is irreducible in
$R[x]$ there corresponds a curve of the third class $\sum a_{ijk} u_i u_j u_k = 0$ such
that none of its tangents $(u_1, u_2, u_3)$ passes through more than one point
with rational coordinates. Compute the $a_{ijk}$.

3-42 Reduction of the problem to the investigation of irreducibility in
D[x]. In many cases it is possible to replace an investigation on
irreducibility in $K[x]$ by an investigation on reducibility in $D[x]$, where $D$ is an
integral domain. This reduction of the problem to an easier one is based on
the following lemma.

Lemma. Let $D$ be an integral domain with unique factorisation, $K =
Q(D)$ the quotient-field of $D$, $f(x)$ and $\varphi(x)$ polynomials of $D[x]$, and $f(x)$
be a primitive polynomial; if $f(x)$ is a factor of $\varphi(x)$ in $K[x]$, then it is also
a factor of $\varphi(x)$ in $D[x]$; if $f(x)$ is irreducible in $D[x]$ it is also irreducible
in $K[x]$.

Proof. Let $\varphi(x) = f(x) \psi(x)$, where $\psi(x)$ is a polynomial of $K[x]$
and $f(x)$ is primitive in $D[x]$ (i.e. the common factors of its coefficients are
unities of $D$ only); then there exists an element $a$ of $D$ such that $a\psi(x)$
belongs to $D[x]$. Hence $a\varphi(x)$ is divisible by $f(x)$ in $D[x]$. From 2-48,
corollary, it follows that $\varphi(x)$ is divisible by $f(x)$ in $D[x]$. Let now $\varphi(x)$ be
divisible in $K[x]$ by any polynomial $\varphi_1(x)$ of degree $m$, where $0 < m <$
degree $\varphi(x)$. Then $\varphi_1(x) = (a : b) f(x)$, where $a$ and $b$ are elements of $D$,
whereas $f(x)$ is a primitive polynomial of $D[x]$ and of degree $m$. Hence $\varphi(x)$
is divisible by $f(x)$ in $D[x]$. Irreducibility in $D[x]$ therefore implies
irreducibility in $K[x]$. Hence the lemma.

3-421 Irreducibility in R[x]. When the reducibility of a polynomial
$\varphi(x)$ of $R[x]$ has to be investigated ($R$, as usual, denotes the field of the
rational numbers), one may suppose without loss of generality that the
coefficients of $\varphi(x)$ are integral numbers. From the preceding lemma it
follows that it suffices to inquire whether $\varphi(x)$ is reducible in $I[x]$, where $I$
is the ring of the integral numbers. The method described below is
applicable in principle to every case as it leads to a decision after a finite number
of steps. Its practical application however needs a very skilful handling,
otherwise that finite number may become impracticably large.

Let $\varphi(x) = \psi(x)\,\varphi_1(x)$, where the coefficients are integral
numbers, and $\varphi(x)$ is of degree $2n$ or $2n + 1$. Without loss of
generality suppose that the degree of $\psi(x)$ is $\le n$. Let
$a_0, a_1, \dots, a_n$ be arbitrary different integers, then
$\varphi(a_i) = \psi(a_i)\,\varphi_1(a_i)$; hence $\psi(a_i)$ is a factor of
$\varphi(a_i)$. Let

$$\psi(x) = y_0 + y_1 x + \cdots + y_n x^n,$$

and let $g_i^{(1)}, \dots, g_i^{(k_i)}$ be the different factors of
$\varphi(a_i)$. The integers $y$ must therefore satisfy one of the following
systems of $n + 1$ linear non-homogeneous equations

$$\begin{aligned}
y_0 + a_0 y_1 + \cdots + a_0^n y_n &= g_0^{(r_0)} \\
&\ \,\vdots \\
y_0 + a_n y_1 + \cdots + a_n^n y_n &= g_n^{(r_n)}.
\end{aligned}$$

To every system $r_0, \dots, r_n$, $0 < r_i \le k_i$, there exists a system
of equations, and to every system of equations there exists one solution
$(y_0, \dots, y_n)$ as the determinant $D_n$ of the homogeneous systems is
$\ne 0$. To prove the last statement one may use mathematical induction. For
$n = 1$, $D_1 = a_1 - a_0 \ne 0$. In the general case, apply the method of
sweep-out to the first row by multiplying the first column successively with
$a_0, a_0^2, \dots, a_0^n$ and subtracting it from the second, third, ...,
$(n + 1)$st columns respectively. Hence
$D_n = (a_1 - a_0)(a_2 - a_0) \cdots (a_n - a_0)\, D'$, where

$$D' = \begin{vmatrix}
1 & (a_1 + a_0) & (a_1^2 + a_1 a_0 + a_0^2) & \cdots & (a_1^{n-1} + a_1^{n-2} a_0 + \cdots + a_0^{n-1}) \\
\vdots & & & & \vdots \\
1 & (a_n + a_0) & (a_n^2 + a_n a_0 + a_0^2) & \cdots & (a_n^{n-1} + a_n^{n-2} a_0 + \cdots + a_0^{n-1})
\end{vmatrix}$$

which, as one gets easily by successive column-addition, is the determinant
of the same kind formed with $a_1, \dots, a_n$. Thus by mathematical
induction it follows that $D_n \ne 0$ (for a different proof, see 3-53). Of
the solutions $y_0, \dots, y_n$, one has to consider only those which consist
of integral numbers; finally check by the algorithm of division whether the
polynomials $y_0 + y_1 x + \cdots + y_n x^n$ are factors of $\varphi(x)$,
otherwise $\varphi(x)$ is irreducible. This method applies to every integral
domain, where the factorisation is unique and there exists a method of
factorising any element by a finite number of steps, as can be done in $I$.
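
The procedure just described can be tried out on small examples. The following Python sketch is only an illustration under stated assumptions: the function names (`kronecker_factor`, `interpolate`, `divides`), the choice of the points $0, 1, -1, 2, -2, \dots$ and the use of Lagrange's interpolation formula in place of an explicit solution of the linear systems are choices made here, not taken from the text. Polynomials are lists of integers, constant term first.

```python
from fractions import Fraction
from itertools import product


def evaluate(coeffs, x):
    """Horner evaluation; coeffs run from the constant term upward."""
    value = 0
    for c in reversed(coeffs):
        value = value * x + c
    return value


def divisors(m):
    """All positive and negative divisors of a non-zero integer m."""
    positive = [d for d in range(1, abs(m) + 1) if m % d == 0]
    return positive + [-d for d in positive]


def poly_mul(p, q):
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out


def interpolate(points, values):
    """Polynomial of degree <= n taking the given values at the points
    (the unique solution of one linear system, by Lagrange's formula)."""
    result = [Fraction(0)] * len(points)
    for i, (a_i, g_i) in enumerate(zip(points, values)):
        term = [Fraction(g_i)]
        for j, a_j in enumerate(points):
            if j != i:
                term = poly_mul(term, [Fraction(-a_j, a_i - a_j),
                                       Fraction(1, a_i - a_j)])
        result = [r + t for r, t in zip(result, term)]
    return result


def divides(divisor, dividend):
    """True if divisor (degree >= 1) divides dividend with remainder 0."""
    rem = [Fraction(c) for c in dividend]
    d = len(divisor) - 1
    while len(rem) - 1 >= d:
        factor = rem[-1] / divisor[-1]
        for k in range(d + 1):
            rem[len(rem) - 1 - d + k] -= factor * divisor[k]
        rem.pop()  # the leading coefficient has been cancelled
    return all(c == 0 for c in rem)


def kronecker_factor(phi):
    """Return an integral factor of degree 1..deg(phi)//2, or None."""
    n = (len(phi) - 1) // 2
    if n == 0:
        return None
    points, a = [], 0
    while len(points) < n + 1:       # the points 0, 1, -1, 2, -2, ...
        points.append(a)
        a = -a if a > 0 else -a + 1
    values = [evaluate(phi, a) for a in points]
    for a, v in zip(points, values):
        if v == 0:
            return [-a, 1]           # x - a is a factor
    for choice in product(*(divisors(v) for v in values)):
        psi = interpolate(points, choice)
        while psi and psi[-1] == 0:
            psi.pop()
        if len(psi) < 2 or any(c.denominator != 1 for c in psi):
            continue                 # constant or not integral
        if divides(psi, phi):
            return [int(c) for c in psi]
    return None
```

For instance, `kronecker_factor([4, 0, 0, 0, 1])` finds a quadratic factor of $x^4 + 4$, while `kronecker_factor([1, 0, 0, 0, 1])` returns `None` for $x^4 + 1$. The number of systems to be tried is the product of the numbers of divisors, which shows why the method quickly becomes impracticable.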

3-43 Method of homomorphism. The method of 3-42 becomes more powerful if
used in connection with considerations on homomorphism. Let

$$\varphi(x) = f(x)\,\psi(x), \qquad (1)$$

the three polynomials belonging to $D[x]$. By a suitable homomorphism, $D$ is
mapped on an integral domain $D_1$; the equation (1) is transformed by the
same homomorphism into

$$\varphi_1(x) = f_1(x)\,\psi_1(x). \qquad (2)$$

Hence if $\varphi_1(x)$ is irreducible, one of its factors, say $\psi_1(x)$,
must be associated to a unity of $D_1$, and therefore $\varphi(x)$ and $f(x)$
are mapped by the homomorphism on associated polynomials. In this manner, it
is often possible to show that irreducibility of $\varphi_1(x)$ in $D_1[x]$
implies the irreducibility of $\varphi(x)$ in $D[x]$. Criteria of
irreducibility obtained by this method furnish conditions which are
sufficient but not necessary for irreducibility, as the reducibility of
$\varphi_1(x)$ does not imply the reducibility of $\varphi(x)$. For the
application of this method it is essential that $D$ is not a field, as any
ring which is homomorphic to a field is isomorphic to it.

3-431 Eisenstein's theorem. The method explained in 3-43 will be applied now
to prove a theorem from which many more special criteria of irreducibility
have been derived. It will be announced here in its original form, though it
can easily be generalised to any integral domain with unique factorisation.

Eisenstein's Theorem. Let $a_0, \dots, a_n$ be integers, $a_0, \dots,
a_{n-1}$ be divisible by an arbitrary prime number $p$, $a_n$ be not
divisible by $p$, and $a_0$ not by $p^2$, then
$f(x) = a_0 + a_1 x + \cdots + a_n x^n$ is irreducible in $R[x]$.

Proof. Represent the domain $I$ of the integral numbers and $I[x]$ by the
classes (mod $p$). $I$ is mapped homomorphically on a field $GF_p$ and $I[x]$
on $GF_p[x]$ which is a ring with unique factorisation. Let
$f(x) = f_1(x)\,f_2(x)$, where degree $f_1(x) = s > 0$ and degree
$f_2(x) = t > 0$. By the homomorphism it follows

$$(f(x)) = (f_1(x))\,(f_2(x)) = (a_n)\,(x) \cdots (x)$$

(where the brackets are used for denoting the classes of residues as in
2-13). As $(x)$ is a prime-element in $GF_p[x]$ and the factorisation in this
domain is unique, it follows that

$$(f_1(x)) = (c)\,(x)^s, \quad (f_2(x)) = (d)\,(x)^t, \quad s + t = n,$$

$c$ and $d$ not divisible by $p$. Hence

$$f_1(x) = c\,x^s + p\,\varphi_1(x), \qquad f_2(x) = d\,x^t + p\,\varphi_2(x).$$

Since $s > 0$ and $t > 0$, the constant terms of $f_1(x)$ and $f_2(x)$ are
both divisible by $p$. Hence the term $a_0$ in the product $f_1(x)\,f_2(x)$
is divisible by $p^2$, contrary to the supposition. Hence $f(x)$ is
irreducible in $R[x]$.
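
The hypotheses of the theorem are easily tested mechanically. The short Python sketch below is an illustration only; the name `eisenstein_applies` is hypothetical, the coefficients are listed as $[a_0, \dots, a_n]$, and the function assumes (without checking) that `p` is a prime number.

```python
def eisenstein_applies(coeffs, p):
    """True if coeffs = [a_0, ..., a_n] satisfy Eisenstein's hypotheses
    for the prime number p (primality of p is assumed, not verified)."""
    *lower, leading = coeffs
    return (leading % p != 0                      # a_n not divisible by p
            and all(a % p == 0 for a in lower)    # a_0, ..., a_{n-1} divisible by p
            and coeffs[0] % (p * p) != 0)         # a_0 not divisible by p^2
```

For example, the test succeeds for $x^4 + 10x + 5$ with $p = 5$, so that this polynomial is irreducible in $R[x]$. A negative answer decides nothing: the criterion is sufficient, not necessary.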

3-432 A special case. The same homomorphism will be used now to prove the
irreducibility of

$$\psi(x) = x^4 + p\,(a x^3 + b x^2 + c x) + d,$$

where $p$ is a prime number of the type $4m + 1$, and $d$ is a quadratic
non-residue of $p$. A polynomial of degree $< 4$ cannot be congruent to
$\psi(x)$ mod $p$. Hence it suffices to prove that $x^4 + d$ is irreducible
in $GF_p[x]$. As $-1$ is a quadratic residue, there is an integer $e$ such
that $(e)^2 = -1$, $(e)^4 = 1$; furthermore $-d$ is a quadratic non-residue.
Hence $x^4 + d \equiv 0$ (mod $p$) has no solution, and therefore
$(x^4 + d)$ has no factor of degree 1 in $GF_p[x]$. To prove that it has no
factor of degree 2, extend $GF_p$ by a root $\delta$ of $x^4 + d$. In
$GF_p(\delta)$

$$x^4 + d = \prod_{k=1}^{4} (x - \delta\, e^k).$$

Each factor of degree 2 has therefore a coefficient $\delta^2 e^{k+l}$. Since
$\delta^2$ does not belong to $GF_p$, $x^4 + d$ is irreducible in $GF_p[x]$.
Hence $\psi(x)$ is irreducible in $I[x]$ and therefore also irreducible in
$R[x]$.

3-433 Irreducibility of the cyclotomic polynomials in $R[x]$. The same method
has to be applied in a somewhat more subtle manner to prove the
irreducibility of the cyclotomic polynomials in $R[x]$.

Theorem. The cyclotomic polynomials are irreducible in the field $R$ of the
rational numbers.

Proof. It suffices to prove the irreducibility in $I[x]$ where $I$ is the
ring of the integral numbers. Consider $x^n - 1$ as a polynomial in $I[x]$
and let $\alpha$ be a primitive root of it in a suitable extension
$R(\alpha)$. Then one gets all the primitive roots in the form $\alpha^m$
where $(m, n) = 1$. Hence one has to prove that if $\alpha$ is a root of a
primitive polynomial $f(x)$ which is irreducible in $I[x]$, then $\alpha^m$
is also a root. It suffices to prove it for prime numbers $p$ which are
relatively prime to $n$; the general case follows from it by a trivial
mathematical induction to be taken over the number of the prime-factors of
$m$. As $f(x)$ is a primitive polynomial, it follows from the lemma that it
is a factor of $x^n - 1$ in $I[x]$. The coefficient of the highest term is
therefore a unity, say

$$f(x) = x^r + a_{r-1} x^{r-1} + \cdots + a_0. \qquad (1)$$

The elements $1, \alpha, \dots, \alpha^{r-1}$ form a basis of $R(\alpha)$.
The representation of the elements of $R(\alpha)$ by

$$\sum b_i\, \alpha^i \qquad (2)$$

with rational coefficients $b_i$ is therefore unique. $R(\alpha)$ has a
submodule $M$ consisting of those elements for which the coefficients $b_i$
are integral numbers. The module $M$ contains

$$\alpha^r = -a_{r-1}\alpha^{r-1} - \cdots - a_0, \qquad
\alpha^{r+1} = -a_{r-1}\alpha^{r} - \cdots - a_0\,\alpha, \ \dots$$

Hence $M$ is a ring. As the coefficients of $f(x)$ are integral numbers,
$f(x)$ can also be considered as a polynomial in $GF_p[x]$, where $p$ is any
prime number which is relatively prime to $n$. Let $\beta$ be a root of
$f(x)$, taken as a polynomial of $GF_p[x]$; so it is a primitive root of
$x^n - 1$. Map the ring $M$ on $GF_p(\beta)$ by the correspondence

$$\sum b_i\, \alpha^i \to \sum b_i\, \beta^i, \qquad (3)$$

then to every element of $M$, exactly one element of $GF_p(\beta)$
corresponds. Addition, subtraction and multiplication are invariant for the
representation (3), it is therefore a homomorphism. Different elements of
$M$ may correspond to the same element of $GF_p(\beta)$ but not conversely.
To $\alpha^j$, for $j = 1, \dots, n$, there correspond the elements
$\beta^j$, and as these $n$ elements are all different, no two different
elements can correspond to the same power of $\beta$. Hence
$\alpha, \dots, \alpha^n$ and $\beta, \dots, \beta^n$ are put in a
(1, 1)-correspondence by (3). The roots of $f(x)$ in $R(\alpha)$ are powers
of $\alpha$, those in $GF_p(\beta)$ are powers of $\beta$; as $f(x)$ is
unaltered by (3), the exponents must be the same. From the corollary in 3-22
it follows that $\beta^p$ must be a root of $f(x)$ in $GF_p(\beta)$. Hence
$\alpha^p$ is a root of $f(x)$, where $p$ was supposed to be any prime number
which is relatively prime to $n$. Hence the theorem.

It is remarkable that homomorphisms mapping $I[x]$ on different rings
$GF_p[x]$ help to prove the irreducibility of the cyclotomic polynomial in
$R[x]$, though the cyclotomic polynomial may be reducible in each of these
rings.

Exercise. Show that the cyclotomic polynomial for $n = 8$ is reducible in
every $GF_p$. Discuss the factorisation for the different classes of prime
numbers $p$. Show in particular how the roots are distributed for $p = 3$, 5
and 7, and that from the difference of the distribution of the roots in these
three cases, the irreducibility of the polynomial in $R[x]$ already follows.
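
For readers who wish to experiment before attempting the exercise, here is a small Python sketch. It uses the known fact that the cyclotomic polynomial for $n = 8$ is $x^4 + 1$, and searches by brute force for all ways of writing it in $GF_p[x]$ as a product of two quadratic polynomials with highest coefficient 1. The function name `quadratic_splittings` is hypothetical, and `p` is assumed to be a prime number.

```python
def quadratic_splittings(p):
    """All (a, b, c, d) with x^4 + 1 = (x^2 + a x + b)(x^2 + c x + d)
    in GF_p[x]; p is assumed to be a prime number."""
    found = []
    for a in range(p):
        c = (-a) % p                               # coefficient of x^3 must vanish
        for b in range(p):
            for d in range(p):
                if ((b + d + a * c) % p == 0       # coefficient of x^2
                        and (a * d + b * c) % p == 0   # coefficient of x
                        and (b * d - 1) % p == 0):     # constant term
                    found.append((a, b, c, d))
    return found
```

The list is non-empty for every prime number tried, e.g. `(1, 2, 2, 2)` occurs for $p = 3$ and `(0, 2, 0, 3)` for $p = 5$. The search only illustrates the statement of the exercise and is no substitute for its proof.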

3-44 Irreducibility of determinants. As an example of a proof of
irreducibility of a polynomial in more than one indeterminate, the following
interesting theorem will be established.

Theorem. Let $K$ be an arbitrary field in which none of the indeterminates
$x^1_1, \dots, x^n_n$ occurs, and let $n > 0$, then the determinant
$X = \det(x^i_k)$ is irreducible in $K[x^1_1, \dots, x^n_n]$.

Proof. $X$ is a linear function of each of the $n^2$ indeterminates, and it
is a linear and homogeneous polynomial in the $n$ indeterminates of the first
row

$$X = A_1 x^1_1 + \cdots + A_n x^1_n.$$

Suppose $X$ to be reducible, then it must be the product of a linear
polynomial in $x^1_1, \dots, x^1_n$ and a polynomial which is of degree 0 in
these indeterminates, the latter being a common factor of $A_1, \dots, A_n$.
If $n > 1$, $A_i$ is the determinant of the $(n - 1)^2$ indeterminates
$x^s_t$, where $s \ne 1$, $t \ne i$; if $n = 1$, then $A_1 = 1$. Hence
$A_i \ne 0$, and therefore it is not divisible by any polynomial in
indeterminates different from $x^s_t$. As there is no indeterminate which
occurs in $A_1, A_2, \dots, A_n$ simultaneously, every common factor of these
polynomials must be an element of $K$. Hence $X$ is irreducible.

3-5 Symmetric polynomials 

3-51 Elementary symmetric polynomials. Let $K$ be an arbitrary field. A
polynomial $f(x_1, \dots, x_n)$ of $K[x_1, \dots, x_n]$ is said to be a
symmetric polynomial if $f(x_1, \dots, x_n)$ is not altered by any
permutation of $x_1, \dots, x_n$. The integral rational functions
corresponding to symmetric polynomials are said to be integral symmetric
functions.

As the polynomial

$$\prod_{i=1}^{n} (x - x_i) = x^n - a_1 x^{n-1} + \cdots + (-1)^n a_n \qquad (1)$$

does not change by any permutation of the indeterminates $x_i$, the
coefficients

$$a_j = a_j(x_1, \dots, x_n) \qquad (2)$$

are symmetric polynomials. These are called the elementary symmetric
polynomials. They are represented by

$$a_1 = \sum x_k = x_1 + \cdots + x_n, \quad
a_2 = \sum x_{k_1} x_{k_2}, \quad \dots, \quad
a_n = x_1 x_2 \cdots x_n. \qquad (3)$$

The summation has to be taken over all the systems of different indices
$k_1, \dots, k_j$. Let $f(x) = x^n + b_1 x^{n-1} + \cdots + b_n$ be a
polynomial of $K[x]$. In a suitable extension of $K$

$$f(x) = \prod_{k=1}^{n} (x - \alpha_k), \qquad (4)$$

hence

$$b_j = (-1)^j\, a_j(\alpha_1, \dots, \alpha_n).$$

The coefficients of $f(x)$ are therefore symmetric functions of the roots.

Theorem. Let $f(y_1, \dots, y_n)$ be a polynomial of $K[y_1, \dots, y_n]$,
let $a_i$ be the elementary symmetric polynomials, defined by (3), and let
$F(x_1, \dots, x_n) = f(a_1, \dots, a_n)$. Then $F(x_1, \dots, x_n)$ is the
polynomial 0, only if $f(y_1, \dots, y_n)$ is the polynomial 0.

Proof. Obviously $f(y_1, \dots, y_n) = 0$ implies $F(x_1, \dots, x_n) = 0$.
Suppose now that $F(x_1, \dots, x_n) = 0$; it will be shown that
$f(y_1, \dots, y_n) = 0$ follows from this supposition. In a suitable
extension of $K(y_1, \dots, y_n)$, the polynomial
$\varphi(z) = z^n - y_1 z^{n-1} + \cdots + (-1)^n y_n$ has roots, say
$\alpha_1, \dots, \alpha_n$. The elements $y_1, \dots, y_n$ are therefore the
elementary symmetric functions of those roots

$$y_l = a_l(\alpha_1, \dots, \alpha_n), \quad \text{for } l = 1, \dots, n.$$

Put $\alpha_1, \dots, \alpha_n$ for $x_1, \dots, x_n$, then

$$F(\alpha_1, \dots, \alpha_n) = f(y_1, \dots, y_n).$$

As $y_1, \dots, y_n$ are indeterminates over $K$, the right hand side is
different from 0, unless $f$ is the polynomial 0. As however
$F(x_1, \dots, x_n) = 0$, so $F(\alpha_1, \dots, \alpha_n) = 0$. Hence the
theorem.

Corollary. Let the coefficients of $f(y_1, \dots, y_n)$ be integral numbers
and let $p$ be a prime number. Then
$f(a_1, \dots, a_n) = F(x_1, \dots, x_n) \equiv 0$ (mod $p$) only if the
coefficients of $f$ are congruent to 0 (mod $p$).





Proof. The ring $GF_p[x_1, \dots, x_n]$ is homomorphic to the ring of the
polynomials in $x_1, \dots, x_n$ with integral coefficients. If
$F(x_1, \dots, x_n) \equiv 0$ (mod $p$), the corresponding polynomial of
$GF_p[x_1, \dots, x_n]$ is the polynomial 0, hence $f(y_1, \dots, y_n)$
corresponds to the polynomial 0 of $GF_p[y_1, \dots, y_n]$. Therefore the
coefficients of $f$ are divisible by $p$.

3-52 The main theorem.

Theorem 1. A symmetric polynomial can be represented in one and only one
manner as the sum of homogeneous symmetric polynomials of different degrees.

Proof. Every polynomial of $K[x_1, \dots, x_n]$ can be represented as the sum
of homogeneous polynomials of the same ring, and we can choose the summands
so that no two of them have the same degree. The difference of two such
representations of the same polynomial is a representation of 0 as a sum of
homogeneous polynomials, no two having the same degree, which is impossible.
Therefore the representation is unique. A homogeneous polynomial is
transformed by a permutation of the indeterminates into a homogeneous
polynomial of the same degree; hence the homogeneous portions of a symmetric
polynomial are transformed each into itself by every permutation, and they
are therefore symmetric homogeneous polynomials.

Let $P$ be a permutation of $1, \dots, n$, and $P'$ its inverse. If two terms

$$x_{i_1}^{r_1} \cdots x_{i_n}^{r_n} \quad \text{and} \quad
x_{k_1}^{r_1} \cdots x_{k_n}^{r_n} \qquad (1)$$

with the same system of exponents are transformed by $P$ into equal terms,
they will also be transformed by $PP'$ into equal terms and therefore they
are equal. Hence by any permutation, different terms are carried into
different terms. Therefore the polynomial

$$\sum x_{k_1}^{r_1} \cdots x_{k_n}^{r_n}, \qquad (2)$$

which denotes the sum of all different terms which one gets by all the
permutations of the lower indices of (1), is a symmetric polynomial. By
taking the exponents in a non-decreasing order, (2) is represented by the
symbol

$$[t_1, \dots, t_n] = \sum x_1^{t_n}\, x_2^{t_n + t_{n-1}} \cdots
x_n^{t_n + t_{n-1} + \cdots + t_1}, \qquad (3)$$

where every $t_j \ge 0$; the summation on the right side of (3) has to be
performed as in (2). The degree of $[t_1, \dots, t_n]$ is

$$m = t_1 + 2 t_2 + \cdots + n\, t_n. \qquad (4)$$

Theorem 2. If $f(x_1, \dots, x_n)$ is a homogeneous symmetric polynomial of
degree $m$, it can be represented by

$$f(x_1, \dots, x_n) = \sum_{i=1}^{N} c_i\, [t_1^{(i)}, \dots, t_n^{(i)}], \qquad (5)$$

where $c_i$ are coefficients of $f$, and the sum is taken over the $N$
different systems of non-negative integers $t_1, \dots, t_n$ satisfying (4).

Proof. As by a suitable permutation of the indeterminates $x_i$, each term on
the right side of (3) can be transformed into every other one and different
terms are always transformed into different ones, either each of these terms
occurs as a term of $f$, or none of them does. If therefore one of the terms
of (3) has in $f$ the coefficient $c_1$, then

$$f(x_1, \dots, x_n) - c_1\, [t_1, \dots, t_n]$$

is a symmetric polynomial in $x_1, \dots, x_n$, where none of the terms of
$[t_1, \dots, t_n]$ occurs. By repeating this procedure with all
non-negative integral solutions of (4), after $N$ steps, the difference of
the two sides of (5) is proved to be equal to 0.

Main theorem of symmetric polynomials. Any symmetric polynomial
$f(x_1, \dots, x_n)$ of $K[x_1, \dots, x_n]$ can be represented in one and
only one manner by

$$f(x_1, \dots, x_n) = F(a_1, \dots, a_n), \qquad (6)$$

where $a_i$ are the elementary symmetric polynomials, and the coefficients of
$F$ belong to $K$.

Proof. The symmetric homogeneous polynomials of degree $m$ form a module $M$
over $K$ (see 2-61) generated by the polynomials $[t_1, \dots, t_n]$. The
rank of $M$ is therefore not greater than $N$. The polynomials

$$a_1^{t_1}\, a_2^{t_2} \cdots a_n^{t_n} \qquad (7)$$

belong to $M$, when the exponents satisfy (4), and from 3-51 it follows that
the $N$ polynomials (7) are independent. Hence they form a basis of $M$. So
every homogeneous symmetric polynomial can be represented by (6), and from
3-52, theorem 1, it follows that the same holds for any symmetric polynomial.
If there are two such representations by different $F_1(a_1, \dots, a_n)$ and
$F_2(a_1, \dots, a_n)$, then the difference must be 0 contrary to 3-51. Hence
the main theorem.

Lemma. The polynomials (3) can be generated by addition and subtraction of
polynomials (7).

Proof. Let $R$ be the field of the rational numbers. Since every polynomial
(3) can be considered as a polynomial of $R[x_1, \dots, x_n]$, it can be
represented as a linear homogeneous function of the $N$ polynomials (7) of
the same degree. Therefore an equation

$$c\,[t_1, \dots, t_n] = c_1 \prod a_i^{r_i} + \cdots + c_N \prod a_i^{s_i}$$

holds, where $\sum i\, r_i = \cdots = \sum i\, s_i = m$, and where
$c_1, \dots, c_N$ and $c > 0$ are integers without a common divisor
$\ne \pm 1$. It shall be shown that $c = 1$. If not, let $p$ be a
prime-factor of $c$, then not every $c_j$ is divisible by $p$, and

$$c_1 \prod a_i^{r_i} + \cdots + c_N \prod a_i^{s_i} \equiv 0 \pmod{p}$$

holds, contrary to the corollary in 3-51. Hence $c = 1$, and the lemma holds.

Since the elements $c_i$ in (5) are coefficients of $f(x_1, \dots, x_n)$, the
following theorem is a direct consequence of the lemma.

Theorem 3. The coefficients of $F$ are elements of the ring generated by the
coefficients of $f$.

3-53 Alternating polynomials. A polynomial $f(x_1, \dots, x_n)$ of
$K[x_1, \dots, x_n]$ is said to be an alternating polynomial if by every odd
permutation of $x_1, \dots, x_n$ it takes the factor $-1$. As an even
permutation is composed of two odd permutations, an alternating polynomial is
not altered by any even permutation of the indeterminates. The product of two
alternating polynomials is a symmetric polynomial, and the product of an
alternating and a symmetric polynomial is alternating. If an alternating
polynomial is divisible by an alternating (a symmetric) polynomial, the
quotient is a symmetric (an alternating) polynomial. Since every odd
permutation is composed of an odd number of transpositions (see 0-3), a
polynomial is an alternating one if it takes the factor $-1$ whenever one
transposition of its indeterminates is performed. When the characteristic of
$K$ is equal to 2, every alternating polynomial is symmetric and conversely;
when the characteristic is different from 2, the polynomial 0 is the only
polynomial which can be considered to be symmetric and alternating as well.

Theorem 1. If a polynomial $f(x_1, \dots, x_n)$ of $K[x_1, \dots, x_n]$ has
the property that by every permutation of the indeterminates it is
transformed into a polynomial which is divisible by $f(x_1, \dots, x_n)$,
then it is either symmetric or alternating.

Proof. Let $c_{pq}$ be the factor which is taken by $f(x_1, \dots, x_n)$ when
$x_p$ and $x_q$ are interchanged, then the polynomial takes $c_{pq}^2$ when
this transposition is performed twice; but the twice performed transposition
does not alter the polynomial, hence $c_{pq}^2 = 1$. Hence $c_{pq} = \pm 1$.
To prove that in this equation the same sign $+$ or $-$ holds for every pair
$p, q$ of indices, consider the following identity between transpositions

$$(p, q) = (i, p)\,(k, q)\,(i, k)\,(k, q)\,(i, p).$$

From this equation it follows that

$$c_{pq} = c_{ip}\, c_{kq}\, c_{ik}\, c_{kq}\, c_{ip}
= c_{ip}^2\, c_{kq}^2\, c_{ik} = c_{ik}.$$

Hence either every transposition leaves the polynomial invariant, then it is
symmetric, or it takes the factor $-1$ by every transposition, and it is
therefore alternating.

Let $f(x_1, \dots, x_n)$ be an alternating polynomial of
$K[x_1, \dots, x_n]$, where the characteristic of $K$ is different from 2.
Replace $x_k$ by $x_i$, then one gets a polynomial
$f(x_1, \dots, x_i, \dots, x_i, \dots, x_n)$ which is alternating, but
nevertheless invariant for the interchanging of the $i$th and the $k$th
indeterminate. Since the characteristic is supposed to be different from 2,
the polynomial is 0. Now

$$f(x_1, \dots, x_n) = f(x_1, \dots, x_i, \dots, x_k, \dots, x_n)
- f(x_1, \dots, x_i, \dots, x_i, \dots, x_n)$$

is divisible by $x_i - x_k$, as one sees by forming the differences of
corresponding terms. The $n(n-1)/2$ polynomials $x_i - x_k$, where $i < k$,
are non-associated prime-elements of $K[x_1, \dots, x_n]$; hence
$f(x_1, \dots, x_n)$ is divisible by $\prod_{i<k} (x_i - x_k)$. This product
is an alternating polynomial itself; the quotient is therefore a symmetric
polynomial. Hence the following theorem holds.

Theorem 2. Let $f(x_1, \dots, x_n)$ be an alternating polynomial of
$K[x_1, \dots, x_n]$, where the characteristic of $K$ is different from 2,
and

$$D = \prod_{i<k} (x_i - x_k), \qquad (1)$$

then

$$f(x_1, \dots, x_n) = D\,S, \qquad (2)$$

where $S$ is a symmetric polynomial.

Corollary.

$$D = \begin{vmatrix}
x_1^{n-1} & x_1^{n-2} & \cdots & x_1 & 1 \\
\vdots & & & & \vdots \\
x_n^{n-1} & x_n^{n-2} & \cdots & x_n & 1
\end{vmatrix} \qquad (3)$$

Proof. The determinant is obviously an alternating polynomial and therefore
divisible by $D$. The second factor $S$ is of degree 0; its value is found to
be the unit element by comparing the coefficient of the diagonal term of the
determinant with the coefficient of the corresponding term in $D$.

3-54. Symmetric rational functions.

Theorem 1. Let $f(x_1, \dots, x_n)$ and $g(x_1, \dots, x_n)$ be polynomials
of $K[x_1, \dots, x_n]$, let $(f, g) = 1$ and $f : g$ be a symmetric rational
function, then $f$ and $g$ are symmetric polynomials.

Proof. Let $f$ be transformed into $f_1$, and $g$ into $g_1$ by an arbitrary
permutation of the indeterminates; then $f g_1 = g f_1$. From 2-47 and the
supposition of this theorem it follows that $f$ is divisible by $f_1$, that
$g$ is divisible by $g_1$ and conversely. As this holds for every permutation
of the indeterminates, it follows from theorem 1 of 3-53 that $f$ and $g$ are
symmetric or alternating. If one of them is alternating, the other must also
be alternating, for the quotient is symmetric; but in this case the two
polynomials must be divisible by $D$, contrary to the supposition
$(f, g) = 1$. Hence $f$ and $g$ are symmetric.

Theorem 2. Let $F(a_1, \dots, a_n)$ and $G(a_1, \dots, a_n)$ be polynomials
of $K[a_1, \dots, a_n]$, $a_i$ being the elementary symmetric polynomials of
$x_1, \dots, x_n$, and let $(F, G) = H(a_1, \dots, a_n)$,
$F = f(x_1, \dots, x_n)$, $G = g(x_1, \dots, x_n)$,
$H = h(x_1, \dots, x_n)$; then is $h = (f, g)$.

Proof. Let $h' = (f, g)$. From the preceding theorem it follows that
$f : h'$ and $g : h'$ are symmetric. Hence $h'$ is symmetric and can be
represented by $H'(a_1, \dots, a_n)$. Since $H'$ is a common factor of $F$
and $G$, it is a factor of $H$ and therefore $h'$ is a factor of $h$; but $h$
is also a factor of $h'$, for $h' = (f, g)$. Hence $h$ and $h'$ are
associated, and the theorem holds.


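3-55 Power sums. The symmetric polynomials

$$s_j = \sum_{i=1}^{n} x_i^{\,j} \qquad (1)$$

are called power-sums. From (1) it follows for $m \le n$ that

$$\begin{aligned}
s_1 &= a_1, \\
s_{m-1}\, a_1 &= s_m + \sum x_1^{m-1} x_2, \\
&\ \,\vdots \\
s_{m-k}\, a_k &= \sum x_1^{m-k+1} x_2 \cdots x_k + \sum x_1^{m-k} x_2 \cdots x_{k+1}, \\
&\ \,\vdots \\
s_1\, a_{m-1} &= \sum x_1^{2} x_2 \cdots x_{m-1} + m\, a_m.
\end{aligned}$$

Hence

$$\sum_{i=1}^{m-1} (-1)^i\, s_{m-i}\, a_i = -s_m + (-1)^{m-1}\, m\, a_m. \qquad (2)$$

Similarly for $m > n$,

$$\sum_{i=1}^{n} (-1)^i\, s_{m-i}\, a_i = -s_m \qquad (3)$$

holds.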

Theorem. Let $K$ be a field of characteristic 0; then $a_m$ can be expressed
by $s_1, \dots, s_m$ with coefficients from $K$, and therefore the symmetric
polynomials of $K[x_1, \dots, x_n]$ are polynomials of
$K[s_1, \dots, s_n]$.

Proof. $a_1 = s_1$; by mathematical induction we suppose that
$a_1, \dots, a_{m-1}$ can be expressed in terms of $s_1, \dots, s_{m-1}$.
From (2) it follows that $a_m$ can be expressed in terms of
$s_1, \dots, s_m$.
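
The induction of the proof is a recursion which can be carried out directly. The following Python sketch solves formula (2) for $a_m$, i.e. $m\,a_m = (-1)^{m-1}\,[\,s_m + \sum_{i=1}^{m-1} (-1)^i s_{m-i}\, a_i\,]$, using exact rational arithmetic; the function name `elementary_from_power_sums` is hypothetical.

```python
from fractions import Fraction


def elementary_from_power_sums(s):
    """Given s = [s_1, ..., s_n], return [a_1, ..., a_n] by formula (2)."""
    a = []
    for m in range(1, len(s) + 1):
        total = Fraction(s[m - 1])                        # s_m
        for i in range(1, m):
            total += (-1) ** i * s[m - i - 1] * a[i - 1]  # (-1)^i s_{m-i} a_i
        a.append((-1) ** (m - 1) * total / m)             # division by m
    return a
```

For $x_1, x_2, x_3 = 1, 2, 3$ the power-sums are 6, 14, 36, and the function returns $a_1 = 6$, $a_2 = 11$, $a_3 = 6$. The division by $m$ in the last step is the point where characteristic 0 (or at least a characteristic greater than $n$) is needed.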

It may be mentioned that this theorem does not hold if $K$ has a
characteristic $p \le n$, and that the elementary symmetric polynomials
cannot be expressed in terms of power sums by the help of integral
coefficients only. The power-sums offer a possibility to express $D^2$ in a
very simple form, where $D$ is the alternating function defined in 3-53. By
squaring the determinant in 3-53, (3) by columnwise multiplication, one gets

$$D^2 = \begin{vmatrix}
s_{2n-2} & \cdots & s_n & s_{n-1} \\
s_{2n-3} & \cdots & s_{n-1} & s_{n-2} \\
\vdots & & & \vdots \\
s_{n-1} & \cdots & s_1 & n
\end{vmatrix}$$

Exercise. A special problem of the theory of numbers asks for the sets of
integral numbers $a_1, \dots, a_n$ and $b_1, \dots, b_n$ satisfying the $m$
conditions

$$\begin{aligned}
a_1 + \cdots + a_n &= b_1 + \cdots + b_n \\
a_1^2 + \cdots + a_n^2 &= b_1^2 + \cdots + b_n^2 \\
&\ \,\vdots \\
a_1^m + \cdots + a_n^m &= b_1^m + \cdots + b_n^m
\end{aligned}$$

Prove that for $m = n$ there exists no solution except the trivial solutions,
where $b_1, \dots, b_n$ is a permutation of $a_1, \dots, a_n$.

3-6 Solution of special equations by radicals
Equations of the type

$$x^n - a = 0 \qquad (1)$$

are said to be binomial equations. If $\alpha$ is a solution of (1), and
$\varepsilon$ is a primitive root of $x^n - 1$, then the $n$ roots of (1) are
given by $\alpha\,\varepsilon^j$, for $j = 1, \dots, n$, and therefore

$$x^n - a = \prod_{j=1}^{n} (x - \alpha\,\varepsilon^j). \qquad (2)$$

Hence $K(\alpha, \varepsilon)$ is the smallest extension of $K$ admitting the
complete reduction of $x^n - a$. A solution of (1) is said to be an $n$th
root or $n$th radical of $a$, or more particularly a square root (cubic root,
biquadratic root) when $n = 2$ ($n = 3$, $n = 4$). If $a$ is a positive
number, there exists exactly one positive number among the solutions of (1),
and in general this solution is meant if one speaks of the radical.*

If $\alpha_1$ is a radical of an element of $K$, furthermore $\alpha_2$ is a
radical of an element of $K(\alpha_1)$ etc., and $\alpha_n$ is a radical of
an element of $K(\alpha_1, \dots, \alpha_{n-1})$, then every element of
$K(\alpha_1, \dots, \alpha_n)$ can be generated from the elements of $K$ by
performing addition, subtraction, multiplication, division and extraction of
roots. Questions regarding algebraic extensions of this type are answered by
Galois' theory of algebraic equations. This theory is intended to be
discussed in the second volume. Here, only some special classes of equations
which can be solved by radicals will be investigated.

There is little loss of generality in supposing that in the equation

$$x^n + b_1 x^{n-1} + \cdots + b_n = 0 \qquad (3)$$

the coefficient $b_1$ is equal to zero. As $-b_1$ is equal to the sum of the
roots, $b_1 = -(\alpha_1 + \cdots + \alpha_n)$. The transformation

$$x' = x + b_1 : n \qquad (4)$$

transforms (3) into an equation where the coefficient $b_1$ is replaced by 0.
The transformation (4) can be performed unless $n$ is divisible by the
characteristic of the field. For this reason it is supposed for the rest of
this section, that the characteristic of $K$ is different from 2 and from 3.

3-61 Cubic equations. For $n = 2$, under the supposition made just
beforehand, the general equation can be reduced to $x^2 - a = 0$. This
equation can be solved by extracting one square root.

Let $n = 3$. The reduced form of the equation is

$$f(x) = x^3 + p x + q = 0. \qquad (1)$$

* The notion of radical is also used in the theory of ideals and of
hypercomplex systems but in a completely different sense.

Let $K$ be a field containing $p$ and $q$, and let in a suitable extension of
$K$, the roots of (1) be $\alpha_1, \alpha_2, \alpha_3$. The elementary
symmetric functions of the roots are therefore

$$a_1 = 0, \quad a_2 = p, \quad a_3 = -q. \qquad (2)$$

By the formulas of 3-55, one computes easily the power-sums as

$$s_1 = 0, \quad s_2 = -2p, \quad s_3 = -3q, \quad s_4 = 2p^2. \qquad (3)$$

Put

$$D = (\alpha_1 - \alpha_2)(\alpha_1 - \alpha_3)(\alpha_2 - \alpha_3). \qquad (4)$$

This alternating function of the roots is called the discriminant. Its square
is a symmetric function of the roots and therefore it belongs to $K$. Using
the formula for $D^2$, given in 3-55, one gets

$$D^2 = \begin{vmatrix} s_4 & s_3 & s_2 \\ s_3 & s_2 & s_1 \\ s_2 & s_1 & 3 \end{vmatrix}$$

and putting in the values (3) for the power-sums, it is found that

$$D^2 = -4p^3 - 27q^2. \qquad (5)$$

The derivative of $f(x)$ is

$$f'(x) = 3x^2 + p = (x - \alpha_1)(x - \alpha_2) + (x - \alpha_1)(x - \alpha_3)
+ (x - \alpha_2)(x - \alpha_3).$$

Hence

$$f'(\alpha_1) = 3\alpha_1^2 + p = (\alpha_1 - \alpha_2)(\alpha_1 - \alpha_3)
= D : (\alpha_2 - \alpha_3).$$

Moreover

$$-\alpha_1 = \alpha_2 + \alpha_3.$$

Hence it follows that

$$\alpha_2 = \tfrac{1}{2}\,[-\alpha_1 + D : (3\alpha_1^2 + p)]$$

and

$$\alpha_3 = \tfrac{1}{2}\,[-\alpha_1 - D : (3\alpha_1^2 + p)]$$

are elements of $K(D, \alpha_1)$. Since on the other hand $D$ is contained in
$K(\alpha_1, \alpha_2, \alpha_3)$,

$$K(D, \alpha_1) = K(\alpha_1, \alpha_2, \alpha_3)$$

holds. Suppose $f(x)$ to be irreducible in $K[x]$. Then $\alpha_1$ is of
degree 3 over $K$, whereas it follows from (5), that $D$ is of degree 2 or 1
over $K$. Hence
$$[K(D, \alpha_1) : K(D)] = 3.$$

To get $\alpha_1$ by extracting square roots and cubic roots only, one needs
the roots of the cyclotomic polynomial $x^2 + x + 1$. These are denoted by

$$\omega = -\tfrac{1}{2} + \tfrac{1}{2}\sqrt{-3} \quad \text{and} \quad
\omega^2 = -\tfrac{1}{2} - \tfrac{1}{2}\sqrt{-3}.$$

Introduce furthermore Lagrange's symbols

$$(\omega, \alpha) = \alpha_1 + \omega\,\alpha_2 + \omega^2 \alpha_3 \quad
\text{and} \quad (\omega^2, \alpha) = \alpha_1 + \omega^2 \alpha_2 + \omega\,\alpha_3.$$

From $\alpha_1 + \alpha_2 + \alpha_3 = 0$, and $1 + \omega + \omega^2 = 0$,
it follows

$$(\omega, \alpha) + (\omega^2, \alpha) = 3\alpha_1, \quad
\omega^2 (\omega, \alpha) + \omega\,(\omega^2, \alpha) = 3\alpha_2, \quad
\omega\,(\omega, \alpha) + \omega^2 (\omega^2, \alpha) = 3\alpha_3. \qquad (6)$$

By an elementary calculation, one gets

$$\begin{aligned}
(\omega, \alpha)^3 &= -\tfrac{27}{2}\, q + \tfrac{3}{2}\, D \sqrt{-3} \\
(\omega^2, \alpha)^3 &= -\tfrac{27}{2}\, q - \tfrac{3}{2}\, D \sqrt{-3}
\end{aligned} \qquad (7)$$

$$(\omega, \alpha)\,(\omega^2, \alpha) = -3p. \qquad (8)$$

From (7) it follows that Lagrange's symbols can be obtained by extracting
cubic roots only, after having extracted the square roots of
$-4p^3 - 27q^2$ and of $-3$. By extracting a cubic root a factor $\omega$ or
$\omega^2$ remains arbitrary. The two arbitrary factors which occur when the
cubic roots of the right hand sides in formula (7) are extracted, are
interconnected as is seen by (8). If $(\omega, \alpha)$ takes a factor
$\omega^e$, then $(\omega^2, \alpha)$ takes $\omega^{2e}$. These factors
generate an even permutation of the roots $\alpha_1, \alpha_2, \alpha_3$. By
extracting the square roots of $-4p^3 - 27q^2$ and of $-3$, a factor $\pm 1$
is left arbitrary. These factors may interchange Lagrange's symbols and
generate a transposition of $\alpha_2$ and $\alpha_3$. Thus the formulas (5),
(6), (7), and (8) determine the three roots uniquely up to an arbitrary
permutation of them, and the result is obtained by mere extraction of square
roots and cubic roots.
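
The formulas (5) to (8) can be followed numerically. The Python sketch below works with complex floating-point numbers, so it only illustrates the case where $K$ is a field of numbers; the function name `solve_reduced_cubic` is hypothetical. One cubic root is extracted from (7) and the other Lagrange symbol is taken from (8), which is how the two arbitrary factors are tied together.

```python
import cmath


def solve_reduced_cubic(p, q):
    """Roots of x^3 + p*x + q = 0 by formulas (5)-(8), in complex numbers."""
    omega = complex(-0.5, 3 ** 0.5 / 2)          # root of x^2 + x + 1
    D = cmath.sqrt(-4 * p ** 3 - 27 * q ** 2)    # formula (5)
    root_m3 = cmath.sqrt(-3)                     # equals omega - omega^2
    cube_u = -13.5 * q + 1.5 * D * root_m3       # (omega, alpha)^3, formula (7)
    cube_v = -13.5 * q - 1.5 * D * root_m3       # (omega^2, alpha)^3
    if cube_u == 0 and cube_v == 0:
        return [0j, 0j, 0j]                      # p = q = 0: triple root 0
    # Extract one cubic root (from the larger cube, to avoid dividing by 0)
    # and take the other symbol from formula (8): u * v = -3p.
    if abs(cube_u) >= abs(cube_v):
        u = cube_u ** (1 / 3)
        v = -3 * p / u
    else:
        v = cube_v ** (1 / 3)
        u = -3 * p / v
    # Formula (6).
    return [(u + v) / 3,
            (omega ** 2 * u + omega * v) / 3,
            (omega * u + omega ** 2 * v) / 3]
```

For $x^3 - 7x + 6 = 0$ one has $D^2 = 400$, and the sketch returns, up to rounding, the roots 2, 1 and $-3$. A different choice of the cubic root or of the sign of $D$ only permutes the three roots, as stated in the text.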


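3-62. Biquadratic equations. For $n = 4$, the reduced form of the equation is

$$x^4 + p x^2 + q x + r = 0.$$

Let $\alpha_1, \alpha_2, \alpha_3, \alpha_4$ be the roots in a suitable
extension of $K$. Put

$$\begin{aligned}
\beta_1 &= -(\alpha_1 + \alpha_2)(\alpha_3 + \alpha_4) \\
\beta_2 &= -(\alpha_1 + \alpha_3)(\alpha_2 + \alpha_4) \\
\beta_3 &= -(\alpha_1 + \alpha_4)(\alpha_2 + \alpha_3)
\end{aligned}$$

Since by a transposition of any two of the roots, the elements
$\beta_1, \beta_2, \beta_3$ are interchanged only, by an arbitrary
permutation of $\alpha_1, \alpha_2, \alpha_3, \alpha_4$ the elements
$\beta_1, \beta_2, \beta_3$ are interchanged too, hence any symmetric
function of $\beta_1, \beta_2, \beta_3$ is not altered, and it is therefore a
symmetric function of $\alpha_1, \alpha_2, \alpha_3, \alpha_4$. Hence there
exists an equation $x^3 + b_1 x^2 + b_2 x + b_3 = 0$ with elements from $K$
having the roots $\beta_1, \beta_2, \beta_3$. Of course, there is

$$b_1 = 2p, \quad b_2 = p^2 - 4r, \quad b_3 = -q^2.$$

Thus it is possible to find out $\beta_1, \beta_2, \beta_3$ by extracting
roots only. In order to find out $\alpha_1, \alpha_2, \alpha_3, \alpha_4$,
one has to consider that

$$(\alpha_1 + \alpha_2) + (\alpha_3 + \alpha_4) = 0, \quad
(\alpha_1 + \alpha_2)(\alpha_3 + \alpha_4) = -\beta_1,$$

hence $\alpha_1 + \alpha_2$ and $\alpha_3 + \alpha_4$ are the roots of
$x^2 - \beta_1 = 0$:

$$\alpha_1 + \alpha_2 = \sqrt{\beta_1}, \quad \alpha_3 + \alpha_4 = -\sqrt{\beta_1};$$

similarly

$$\alpha_1 + \alpha_3 = \sqrt{\beta_2}, \quad \alpha_2 + \alpha_4 = -\sqrt{\beta_2},$$

$$\alpha_1 + \alpha_4 = \sqrt{\beta_3}, \quad \alpha_2 + \alpha_3 = -\sqrt{\beta_3}.$$

For every root, a factor $\pm 1$ remains arbitrary, but since

$$\sqrt{\beta_1}\,\sqrt{\beta_2}\,\sqrt{\beta_3}
= (\alpha_1 + \alpha_2)(\alpha_1 + \alpha_3)(\alpha_1 + \alpha_4)
= \alpha_1^2\,(\alpha_1 + \alpha_2 + \alpha_3 + \alpha_4)
+ \sum \alpha_i \alpha_j \alpha_k = -q$$

holds, only two of the factors are arbitrary. The choice of these factors
corresponds to a permutation of $\alpha_1, \alpha_2, \alpha_3, \alpha_4$ as
is seen from the final formulas

$$\begin{aligned}
2\alpha_1 &= \sqrt{\beta_1} + \sqrt{\beta_2} + \sqrt{\beta_3}, &
2\alpha_2 &= \sqrt{\beta_1} - \sqrt{\beta_2} - \sqrt{\beta_3}, \\
2\alpha_3 &= -\sqrt{\beta_1} + \sqrt{\beta_2} - \sqrt{\beta_3}, &
2\alpha_4 &= -\sqrt{\beta_1} - \sqrt{\beta_2} + \sqrt{\beta_3}.
\end{aligned}$$

If one excludes the cases where $K$ is of characteristic 2 or 3, the
equations of degree $\le 4$ can therefore be solved by extracting square
roots and cubic roots only.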
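
A numerical check of these formulas may be helpful. In the Python sketch below the function names `resolvent_cubic` and `quartic_roots_from_betas` are hypothetical, the three elements $\beta_1, \beta_2, \beta_3$ are assumed to be known already (e.g. from the cubic equation), and complex floating-point numbers stand in for a general field.

```python
import cmath


def resolvent_cubic(p, q, r):
    """Coefficients (b1, b2, b3) of x^3 + b1 x^2 + b2 x + b3 = 0,
    the equation with the roots beta_1, beta_2, beta_3."""
    return 2 * p, p * p - 4 * r, -q * q


def quartic_roots_from_betas(betas, q):
    """Roots of x^4 + p x^2 + q x + r = 0 from beta_1, beta_2, beta_3."""
    r1, r2, r3 = (cmath.sqrt(b) for b in betas)
    # Only two signs are arbitrary: the product of the three square roots
    # must be equal to -q, which fixes the sign of the third one.
    if abs(r1 * r2 * r3 + q) > abs(r1 * r2 * r3 - q):
        r3 = -r3
    return [(r1 + r2 + r3) / 2, (r1 - r2 - r3) / 2,
            (-r1 + r2 - r3) / 2, (-r1 - r2 + r3) / 2]
```

As an example take the roots 1, 2, 3, $-6$, whose sum is 0. Then $p = -25$, $q = 60$, $r = -36$, and $\beta_1, \beta_2, \beta_3 = 9, 16, 25$; these three numbers satisfy the cubic equation with the coefficients returned by `resolvent_cubic(-25, 60, -36)`, and `quartic_roots_from_betas([9, 16, 25], 60)` gives back the four roots.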


3-7 Resultants
Consider the polynomials

$$f(x) = x^n + a_1 x^{n-1} + \cdots + a_n,$$

and $\qquad (1)$

$$g(x) = x^m + b_1 x^{m-1} + \cdots + b_m,$$

where the coefficients belong to the same field. In a suitably chosen
extension, $f(x)$ has the roots $\alpha_1, \dots, \alpha_n$, and $g(x)$ has
the roots $\beta_1, \dots, \beta_m$. The necessary and sufficient condition
for $f(x)$ and $g(x)$ to have a common root, is that the Resultant

$$R(f, g) = \prod_{i,k} (\alpha_i - \beta_k) \qquad (2)$$

is equal to zero. From (2) it follows immediately:

$$R(f, g) = \prod_{i} g(\alpha_i) = (-1)^{nm}\, R(g, f)
= (-1)^{nm} \prod_{k} f(\beta_k). \qquad (3)$$

Hence $R(f, g)$ is a polynomial in $\alpha_1, \dots, \alpha_n$, the
coefficients belonging to the ring which is generated by $b_1, \dots, b_m$;
this polynomial is symmetric. From 3-52 it follows therefore that $R(f, g)$
can be expressed by the elementary symmetric polynomials of
$\alpha_1, \dots, \alpha_n$ with coefficients belonging to the ring generated
by $b_1, \dots, b_m$, and since those elementary symmetric polynomials differ
from the coefficients of $f(x)$ by factors $\pm 1$ only, the resultant is an
element of the ring generated by $a_1, \dots, a_n, b_1, \dots, b_m$. Hence

$$R(f, g) = R(1, a_1, \dots, a_n, 1, b_1, \dots, b_m). \qquad (4)$$

The reason why the constant 1 is used in this notation will become obvious in
3-72. Suppose that the coefficients of $f(x)$ and $g(x)$ are polynomials of
$K[y]$, then $R(f, g)$ is a polynomial of that ring, say $R(f, g) = R(y)$. If
$R(y)$ is of degree $> 0$, there exist in a suitable extension of $K$,
elements $\eta_1, \dots, \eta_s$ such that $R(\eta_j) = 0$. The polynomials
$f(x)$ and $g(x)$ can be considered as elements of the ring $K[x, y]$, say

$$f(x) = f(x, y), \quad g(x) = g(x, y).$$

Then $f(x, \eta_j)$ and $g(x, \eta_j)$ have common roots $\xi_j$; thus

$$f(\xi_j, \eta_j) = g(\xi_j, \eta_j) = 0.$$

In this way, the common solutions of two algebraic equations of two variables

$$f(x, y) = 0, \quad g(x, y) = 0$$

can be found by the help of a resultant.

3-71. Case when the coefficients of the highest term are equal to 1. To find out $R(f, g)$ in terms of the coefficients $a_j$ and $b_k$, consider $\alpha_1, \ldots, \alpha_n, \beta_1, \ldots, \beta_m$ as indeterminates; the coefficients $a_j$ are symmetric polynomials in the indeterminates $\alpha_i$, and similarly the coefficients $b_k$ are symmetric in the indeterminates $\beta_k$. Every term

$$A = a_1^{s_1} \cdots a_n^{s_n} \, b_1^{t_1} \cdots b_m^{t_m} \qquad (1)$$

is a homogeneous polynomial in the $n + m$ indeterminates $\alpha_1, \ldots, \beta_m$ of degree

$$s_1 + 2 s_2 + \ldots + n s_n + t_1 + 2 t_2 + \ldots + m t_m = w. \qquad (2)$$

The integral number $w$ is said to be the weight of $A$. Let $A_1, \ldots, A_N$ be different terms of the type (1); then it follows from 3-57 that

$$c_1 A_1 + \ldots + c_N A_N = B \qquad (3)$$

cannot be equal to 0 unless the $N$ coefficients $c_i$ are zero each. If the terms $A_i$ are not all of the same weight and the coefficients are different from zero, $B$ is the sum of homogeneous polynomials in $(\alpha_1, \ldots, \beta_m)$ of different degrees with non-vanishing coefficients, and it is therefore not a homogeneous polynomial. Since $R(f, g)$ is a homogeneous polynomial of degree $nm$ in the indeterminates, it is the sum of terms (1) of the weight $nm$ each.

From $R(f, g) = \prod_i g(\alpha_i)$ it is seen that the term $b_m^n$ has the coefficient 1; furthermore $R(f, g)$ has the property that it is zero when $f(x)$ and $g(x)$ have a common root. It will be proved now that these 3 properties are characteristic for the resultant.

Theorem 1. Let $S$ be the sum of terms (1) of weight $nm$ each with the property that $S = 0$ when $f(x)$ and $g(x)$ have a common root; moreover let the coefficient of $b_m^n$ in $S$ be equal to 1; then $S = R(f, g)$.

Proof. Express $S$ as a polynomial in $\alpha_1, \ldots, \beta_m$; then $S$ can be considered as a polynomial $\psi(\alpha_i)$, the coefficients being polynomials in the remaining $n + m - 1$ indeterminates. $\psi(\alpha_i) - \psi(\beta_k)$ is divisible by $\alpha_i - \beta_k$. Since $S$ is zero when $\alpha_i$ and $\beta_k$ are equal, $\psi(\beta_k) = 0$. Hence $S = \psi(\alpha_i)$ is divisible by $\alpha_i - \beta_k$. In the ring generated by the indeterminates $\alpha_1, \ldots, \beta_m$ the factorisation is unique, and the $nm$ irreducible polynomials $\alpha_i - \beta_k$ are non-associate. Now $S$ is divisible by each of them, hence it is divisible by their product, which is equal to $R(f, g)$. Since the terms of $S$ are all of weight $nm$, the quotient must be of weight zero. By supposition, the term $b_m^n$ has the coefficient 1 in $S$, that is the same coefficient as in $R(f, g)$; the quotient is therefore equal to 1. Hence the theorem.

A polynomial in $a_1, \ldots, b_m$ with the properties supposed in the last theorem can be found by the following consideration. If $\alpha$ is a common root of $f(x)$ and $g(x)$, then the following $n + m$ equations hold, and these can be considered as linear homogeneous equations for $\alpha^{n+m}, \alpha^{n+m-1}, \ldots, \alpha$:

$$\begin{aligned}
\alpha^{m} f(\alpha) &= 1 \cdot \alpha^{n+m} + a_1 \alpha^{n+m-1} + \ldots + a_n \alpha^{m} + 0 \cdot \alpha^{m-1} + \ldots + 0 \cdot \alpha = 0 \\
&\;\;\vdots \\
\alpha f(\alpha) &= 0 \cdot \alpha^{n+m} + \ldots + 1 \cdot \alpha^{n+1} + a_1 \alpha^{n} + \ldots + a_n \alpha = 0 \\
\alpha^{n} g(\alpha) &= 1 \cdot \alpha^{n+m} + b_1 \alpha^{n+m-1} + \ldots + b_m \alpha^{n} + 0 \cdot \alpha^{n-1} + \ldots + 0 \cdot \alpha = 0 \\
&\;\;\vdots \\
\alpha g(\alpha) &= 0 \cdot \alpha^{n+m} + \ldots + 1 \cdot \alpha^{m+1} + b_1 \alpha^{m} + \ldots + b_m \alpha = 0.
\end{aligned}$$

Hence

$$\begin{vmatrix}
1 & a_1 & \cdots & a_n & & \\
 & 1 & a_1 & \cdots & a_n & \\
 & & \ddots & & & \ddots \\
1 & b_1 & \cdots & b_m & & \\
 & 1 & b_1 & \cdots & b_m & \\
 & & \ddots & & & \ddots
\end{vmatrix} = 0 \qquad (4)$$

when $f(x)$ and $g(x)$ have a common root. Here the first $m$ rows contain the coefficients of $f$, the last $n$ rows contain the coefficients of $g$, and all vacant places are zero. The diagonal element is $b_m^n$, and this term does not occur any more in the determinant; hence the term $b_m^n$ in the determinant has the coefficient 1. The weight of the diagonal element is equal to $nm$. By interchanging any two columns, the weight of the diagonal element is either unaltered or the element is equal to zero; in this manner, one can prove easily that all the terms of the determinant have the same weight, thus the supposition of theorem 1 is satisfied. Hence:

Theorem 2. The determinant (4) is equal to $R(f, g)$.
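
Theorem 2 can be tried out on a small case. The following Python sketch is an illustration only: the helper names `monic_from_roots`, `sylvester` and `det` and the sample roots are assumptions made for the example, not notation of the text. It builds the matrix of determinant (4) for two polynomials with highest coefficient 1 and compares its exact determinant with the product of the differences of the roots.

```python
from fractions import Fraction
from itertools import product
from math import prod

def monic_from_roots(roots):
    """Coefficients (highest power first) of the monic polynomial with the given roots."""
    coeffs = [1]
    for r in roots:
        # multiply the current polynomial by (x - r)
        coeffs = [c - r * p for c, p in zip(coeffs + [0], [0] + coeffs)]
    return coeffs

def sylvester(f, g):
    """Matrix of determinant (4): m shifted rows of f followed by n shifted rows of g."""
    n, m = len(f) - 1, len(g) - 1
    rows = [[0] * i + list(f) + [0] * (m - 1 - i) for i in range(m)]
    rows += [[0] * i + list(g) + [0] * (n - 1 - i) for i in range(n)]
    return rows

def det(matrix):
    """Exact determinant by Gaussian elimination over the rational numbers."""
    a = [[Fraction(x) for x in row] for row in matrix]
    result = Fraction(1)
    for col in range(len(a)):
        pivot = next((r for r in range(col, len(a)) if a[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            result = -result            # a row interchange changes the sign
        result *= a[col][col]
        for r in range(col + 1, len(a)):
            factor = a[r][col] / a[col][col]
            a[r] = [x - factor * y for x, y in zip(a[r], a[col])]
    return result

alphas, betas = [1, 2, 3], [4, -1]      # hypothetical sample roots
f, g = monic_from_roots(alphas), monic_from_roots(betas)
by_roots = prod(a - b for a, b in product(alphas, betas))   # product of all differences
by_determinant = det(sylvester(f, g))                        # determinant (4)
print(by_roots, by_determinant)
```

Exact rational arithmetic is used so that the comparison does not depend on rounding; for polynomials with a common root the same determinant vanishes.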

3-72. The general case. Let

$$F(x) = a_0 x^n + a_1 x^{n-1} + \ldots + a_n \quad \text{and} \quad G(x) = b_0 x^m + b_1 x^{m-1} + \ldots + b_m$$

be polynomials, where the coefficients $a_0, \ldots, a_n, b_0, \ldots, b_m$ are elements of the same field. By definition, the determinant

$$R(F, G) = \begin{vmatrix}
a_0 & a_1 & \cdots & a_n & & \\
 & a_0 & a_1 & \cdots & a_n & \\
 & & \ddots & & & \ddots \\
b_0 & b_1 & \cdots & b_m & & \\
 & b_0 & b_1 & \cdots & b_m & \\
 & & \ddots & & & \ddots
\end{vmatrix} \qquad (1)$$

is called the resultant of $F$ and $G$.


For $a_0 = b_0 = 1$, this definition tallies with the definition given for $R(f, g)$ in 3-71; moreover $R(F, G) = (-1)^{nm} R(G, F)$ holds. For any further investigation, one has to distinguish 4 cases.

1. Let $a_0 \ne 0$, $b_0 \ne 0$; let $\alpha_1, \ldots, \alpha_n$ be the roots of $F(x)$, and $\beta_1, \ldots, \beta_m$ be the roots of $G(x)$. Put

$$\frac{F(x)}{a_0} = f(x) = (x - \alpha_1) \cdots (x - \alpha_n), \qquad \frac{G(x)}{b_0} = g(x) = (x - \beta_1) \cdots (x - \beta_m).$$

Then

$$R(F, G) = a_0^m b_0^n R(f, g) = a_0^m b_0^n \prod_{i,k} (\alpha_i - \beta_k). \qquad (2)$$

Hence $R(F, G) = 0$ is the necessary and sufficient condition for $F(x)$ and $G(x)$ having a common root.

2. Let $a_0 = b_0 = 0$. Then $R(F, G) = 0$, independent of the existence of a common root.

3. $b_0 = b_1 = \ldots = b_m = 0$. Then $R(F, G) = 0$; but as every element satisfies the equation $G(x) = 0$, every root of $F(x)$ is a root of $G(x)$. Similarly if $F(x)$ is the polynomial 0.

4. $a_0 \ne 0$, $b_0 = \ldots = b_{r-1} = 0$, $b_r \ne 0$. Put $G_r(x) = b_r x^{m-r} + \ldots + b_m$; then

$$R(F, G) = a_0^r R(F, G_r) = a_0^m b_r^n \prod_{i,k} (\alpha_i - \beta_k),$$

where the $\beta_k$ are now the roots of $G_r(x)$; in this case the resultant is equal to zero if and only if there exists a common root. Similarly if $b_0 \ne 0$, $a_0 = \ldots = a_{s-1} = 0$, $a_s \ne 0$. Hence:

Theorem. If $F(x)$ and $G(x)$ have a common root, then $R(F, G) = 0$. If on the other hand $R(F, G) = 0$, then $F(x)$ and $G(x)$ have a common root, unless $a_0 = b_0 = 0$.

It appears that the resultant $R(F, G)$ is not completely determined by the two polynomials $F(x)$ and $G(x)$ themselves, but by the manner in which they are represented. In the definition of "polynomial" it was stated (see 2-32) that terms with coefficients zero should not matter, i.e. that polynomials differing by such terms are considered to be equal. By adding terms with coefficients zero to $F(x)$ and $G(x)$, one can always arrange that $a_0 = b_0 = 0$, and therefore $R(F, G) = 0$. This difficulty can be overcome by a rule that polynomials $F(x)$ and $G(x)$ should be written in a standard form where the highest coefficients are different from zero, unless the polynomial is the zero-polynomial. A rule of this kind would however imply another incongruity for the case that the coefficients are functions of a variable, say $y$: the resultant would cease to be a resultant for such values of $y$ for which $a_0(y) = 0$ and $b_0(y) = 0$. For this reason, preference has been given to the notation given here. Thus the resultant depends on the manner in which the polynomials are represented, and it ceases to be a sufficient condition for a common root as soon as $a_0 = b_0 = 0$ holds.

Example. To find the solutions of

$$y x^2 + 2x + y = 0,$$
$$y^2 x^2 - 1 = 0.$$

Consider the left hand sides as polynomials in $x$ with coefficients from $R[y]$. The resultant is

$$\begin{vmatrix}
y & 2 & y & 0 \\
0 & y & 2 & y \\
y^2 & 0 & -1 & 0 \\
0 & y^2 & 0 & -1
\end{vmatrix} = y^2 (y + 1)(y - 1)(y + i\sqrt{3})(y - i\sqrt{3}) = \varphi(y).$$

For $y = 0$, there is $a_0 = b_0 = 0$, and the second equation has no solution. For the four other roots of $\varphi(y)$, there is $a_0 \ne 0$, $b_0 \ne 0$. To each of these roots, there corresponds a solution. These solutions are therefore

$$y = -1, \qquad y = 1, \qquad y = -i\sqrt{3}, \qquad y = i\sqrt{3},$$

$$x = 1, \qquad x = -1, \qquad x = \tfrac{i}{3}\sqrt{3}, \qquad x = -\tfrac{i}{3}\sqrt{3}.$$
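
The example can be re-checked mechanically. The short Python sketch below is an illustration under stated assumptions: the function names `det`, `resultant_at`, `phi`, `F` and `G` are chosen here and are not notation of the text. It evaluates the determinant for several integral values of $y$, compares it with the factorised form, and substitutes the four pairs $(x, y)$ into both equations.

```python
import math

def det(m):
    """Determinant by expansion along the first row (adequate for a 4x4 matrix)."""
    if len(m) == 1:
        return m[0][0]
    return sum((-1) ** j * m[0][j] * det([row[:j] + row[j + 1:] for row in m[1:]])
               for j in range(len(m)))

def resultant_at(y):
    """Resultant of y*x^2 + 2*x + y and y^2*x^2 - 1 with respect to x, for a given y."""
    return det([[y, 2, y, 0],
                [0, y, 2, y],
                [y * y, 0, -1, 0],
                [0, y * y, 0, -1]])

def phi(y):
    """Factorised form of the resultant; (y + i*sqrt(3))(y - i*sqrt(3)) = y^2 + 3."""
    return y ** 2 * (y + 1) * (y - 1) * (y ** 2 + 3)

def F(x, y):
    return y * x ** 2 + 2 * x + y

def G(x, y):
    return y ** 2 * x ** 2 - 1

# Two polynomials of degree at most 6 that agree at 11 points are identical.
agree = all(resultant_at(y) == phi(y) for y in range(-5, 6))

s3 = math.sqrt(3)
solutions = [(1, -1), (-1, 1), (1j / s3, -1j * s3), (-1j / s3, 1j * s3)]   # pairs (x, y)
residuals = [max(abs(F(x, y)), abs(G(x, y))) for x, y in solutions]
print(agree, residuals)
```

The complex pairs are tested in floating point, so their residuals are expected to be small rather than exactly zero.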
3-73 Linear representation of a resultant.

Theorem. Let $D$ be the integral domain generated by the coefficients of $F(x)$ and $G(x)$; then there exist in $D[x]$ polynomials $u(x)$ of degree $< m$, and $v(x)$ of degree $< n$ such that

$$u(x) F(x) + v(x) G(x) = R(F, G). \qquad (1)$$

Proof. Consider the $n + m$ equations

$$x^{\mu} F(x) = a_0 x^{n+\mu} + a_1 x^{n+\mu-1} + \ldots + a_n x^{\mu}, \quad \text{for } \mu = 0, 1, \ldots, m - 1,$$
$$x^{\nu} G(x) = b_0 x^{m+\nu} + b_1 x^{m+\nu-1} + \ldots + b_m x^{\nu}, \quad \text{for } \nu = 0, 1, \ldots, n - 1.$$

These form a system of linear equations in $x^0, \ldots, x^{n+m-1}$; the determinant of the matrix is equal to $R(F, G)$. Multiply each equation with the cofactor of the last element, and add. Then one gets the equation (1), where

$$u(x) = u_1 x^{m-1} + \ldots + u_m, \qquad v(x) = v_1 x^{n-1} + \ldots + v_n,$$

and $u_1, \ldots, v_n$ are the cofactors of the elements of the last column. Hence the theorem.

3-8 Closed fields. A field $C$ is said to be closed if it is impossible to extend it algebraically, i.e. if every polynomial of $C[x]$ of degree $> 1$ is reducible. The main result of 3-8 is that the field of all complex numbers is a closed one.

Theorem 1. Let $K$ be a field of characteristic 0, $[A : K] = 2$; let every polynomial of odd degree of $K[x]$, and also every quadratic polynomial of $A[x]$, have a root in $A$; then $A$ is a closed field.

The proof will be given in two steps. At first it will be shown that any polynomial $\psi(x)$ of $K[x]$ has a root in $A$, and then the same will be proved for the polynomials in $A[x]$.

Proof. 1. Let $\psi(x)$ be a polynomial of $K[x]$. Without loss of generality, one can suppose that $\psi(x)$ is irreducible; as $K$ is supposed to be of characteristic 0, the polynomial $\psi(x)$ is separable (see 2-65) and its roots in any suitable extension are therefore all different. Let $n$ be the degree of $\psi(x)$ and $n = 2^m u$, where $u \equiv 1 \pmod 2$; then the proposition holds for $m = 0$; let it be true for $m < k$, and prove it for $m = k > 0$. In a suitable extension, $\psi(x)$ has the roots

$$\alpha_1, \ldots, \alpha_n. \qquad (1)$$

For an arbitrary element $c$ of $K$, the number of the elements

$$\alpha_i \alpha_j + c(\alpha_i + \alpha_j), \quad \text{where } i, j = 1, \ldots, n \text{ are different}, \qquad (2)$$

is $n(n-1)/2$. The ordered pairs of elements $\alpha_i \alpha_j$, $\alpha_i + \alpha_j$ are all different, and the non-ordered pairs $\alpha_i, \alpha_j$ are uniquely determined by $\alpha_i \alpha_j$, $\alpha_i + \alpha_j$. As $K$ contains an infinity of elements, one can choose $c$ in such a way that the elements (2) are all different and that

$$\alpha_i \alpha_j + c(\alpha_u + \alpha_v) \ne \alpha_p \alpha_q + c(\alpha_s + \alpha_t), \quad \text{if } \alpha_u + \alpha_v \ne \alpha_s + \alpha_t. \qquad (3)$$

By an arbitrary permutation of the elements (1), the elements (2) will be interchanged only. Hence every symmetric function of the elements (2) is also a symmetric function of (1) and is therefore an element of $K$. Thus the elements (2) are the roots of a polynomial $\varphi(x)$ of $K[x]$ of degree $n(n-1)/2 = 2^{m-1} u (n-1)$. As $n$ is even, $u(n-1)$ is odd; therefore, by the induction hypothesis, $\varphi(x)$ has a root in $A$. Let $\alpha_1 \alpha_2 + c(\alpha_1 + \alpha_2)$ be such a root. $\alpha_1 \alpha_2$ is a root of an irreducible polynomial $f(x)$ of $K[x]$, and $\alpha_1 + \alpha_2$ is a root of an irreducible polynomial $g(x)$ of $K[x]$. The roots $\beta' = \alpha_1 \alpha_2, \beta'', \ldots$ of $f(x)$ are all different as the characteristic of $K$ is 0, and similarly the roots $\gamma' = \alpha_1 + \alpha_2, \gamma'', \ldots$ of $g(x)$ are all different. Hence from (3) the elements $\beta^{(i)} + c \gamma^{(k)}$ are all different. We therefore can apply the lemma of 2-72 to the field $K(\beta', \gamma')$. Thus $\beta' + c \gamma'$ is a primitive element of $K(\alpha_1 \alpha_2, \alpha_1 + \alpha_2) = K(\alpha_1 \alpha_2 + c(\alpha_1 + \alpha_2))$; hence $\alpha_1 \alpha_2$ and $\alpha_1 + \alpha_2$ belong to $A$. $\alpha_1$ and $\alpha_2$ are the roots of a polynomial of $A[x]$ of degree 2. Hence $\alpha_1$ and $\alpha_2$ are elements of $A$; $\psi(x)$ has therefore roots in $A$.


2. Let $\psi(x)$ be a polynomial of $A[x]$. To prove that $\psi(x)$ has a root in $A$, consider the automorphisms of $A$ as an extension of $K$. As $[A : K] = 2$, any primitive element, say $\alpha$, of $A$ is a root of a quadratic polynomial which is irreducible in $K[x]$, but reduced in $A[x]$ to $a(x - \alpha)(x - \bar{\alpha})$. Hence $A$ is a normal extension of $K$ (see 3-74), and there exists an automorphism $\mathcal{A}$ of $A$ which interchanges $\alpha$ and $\bar{\alpha}$, whereas the elements of $K$ remain invariant (see 2-742). The polynomial $\psi(x)$ is transformed by $\mathcal{A}$ into $\bar{\psi}(x)$, and every element $\beta$ of $A$ is transformed into a conjugate element $\bar{\beta}$. Then $\psi(x) \bar{\psi}(x) = P(x)$ is a polynomial of $K[x]$ and has therefore a root, say $\beta$, in $A$. Since $\beta$ is a root of $P(x)$, it is a root of $\psi(x)$ or a root of $\bar{\psi}(x)$; if $\beta$ is a root of $\bar{\psi}(x)$, then its conjugate $\bar{\beta}$ is a root of $\psi(x)$. At any rate, $\psi(x)$ has a root in $A$, and as it is supposed to be irreducible, it is of degree 1; therefore every root of $\psi(x)$ belongs to $A$. Hence the theorem.

It is easy to show that the suppositions of theorem 1 are satisfied if $K$ is taken as the field of all the real numbers, and $A$ as the field of all the complex numbers. Of course, the characteristic of both the fields is equal to 0; moreover $A = K(i)$, and therefore $[A : K] = 2$. Let $f(x)$ be a polynomial of an odd degree in $K[x]$, say $f(x) = x^{2n+1} + a_1 x^{2n} + \ldots + a_{2n+1}$; put

$$c = 1 + |a_1| + \ldots + |a_{2n+1}|,$$

then there is $f(c) > 0$ and $f(-c) < 0$. Hence there exists a real number $b$ in the interval $(-c, +c)$ for which $f(b) = 0$ holds. Thus every polynomial of an odd degree with real coefficients has a root in $K$. Finally, every quadratic polynomial of $A[x]$ has roots in $A$, since one can find them by extracting square roots. Hence the suppositions of theorem 1 are satisfied, and therefore the following theorem holds.

Fundamental theorem of Classical Algebra. The field of the complex numbers is closed.
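
The step used above for polynomials of odd degree, namely that $f(-c) < 0 < f(c)$ for $c = 1 + |a_1| + \ldots + |a_{2n+1}|$, can be illustrated numerically. The Python sketch below is only an illustration: bisection is a technique swapped in here to locate the root approximately (the text merely asserts its existence), and the function names and the sample polynomial are assumptions of the example.

```python
def evaluate(coeffs, x):
    """Horner evaluation; coeffs = [1, a_1, ..., a_{2n+1}], highest power first."""
    value = 0
    for c in coeffs:
        value = value * x + c
    return value

def real_root_of_odd_monic(coeffs, tolerance=1e-12):
    """Bisection on (-c, c) with c = 1 + |a_1| + ... + |a_{2n+1}|."""
    c = 1 + sum(abs(a) for a in coeffs[1:])
    low, high = -float(c), float(c)       # f(low) < 0 < f(high) by the choice of c
    while high - low > tolerance:
        mid = (low + high) / 2
        if evaluate(coeffs, mid) < 0:
            low = mid                     # the sign change lies in the upper half
        else:
            high = mid                    # the sign change lies in the lower half
    return (low + high) / 2

sample = [1, 0, -2, -5]                   # hypothetical example: x^3 - 2x - 5
bound = 1 + sum(abs(a) for a in sample[1:])
root = real_root_of_odd_monic(sample)
print(bound, root, evaluate(sample, root))
```

The interval always keeps a negative value at its left end and a non-negative value at its right end, so it shrinks onto a real root.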

Corollary. Every proper algebraic extension of the field $K$ of the real numbers is isomorphic to $K(i)$.

Proof. Let $A$ be an algebraic extension of $K$ and $\alpha$ be an element of $A$ not belonging to $K$. Since $\alpha$ is a root of a polynomial of $K[x]$, $K(\alpha)$ is isomorphic to a subfield of $K(i)$ different from $K$, and as $[K(i) : K] = 2$, $K(\alpha)$ is isomorphic to $K(i)$. Hence $K(\alpha)$ is closed; every element of $A$ is algebraic to $K$, and therefore it must be an element of $K(\alpha)$. Hence $A = K(\alpha)$.

The fundamental theorem of classical algebra can be expressed also in the form: "Every polynomial with complex coefficients (which e.g. may be real and in particular may be rational) has complex roots, and can therefore be represented as a product of linear polynomials with complex coefficients."

CONTINUED FRACTIONS
4-1 General properties of continued fractions 

A real number $\alpha > 1$ can be approximated by an integral number $s$ up to an error $< 1$, say

$$s \le \alpha < s + 1,$$

then

$$\alpha = s + \beta,$$

where either $\beta = 0$, or $1/\beta = \alpha' > 1$.

In the second case, $\alpha'$ can be approximated again by an integral number, and so on. This method leads to the representation of real numbers by continued fractions*, which has many interesting properties; these will be considered in particular in 4-2 and 4-3. The underlying principle is however more general: for the integral numbers as well as for the real numbers $> 1$ one may substitute other systems of mathematical objects. The interconnection between the system $A$ of the real numbers $> 1$ and the domain of the integral numbers which has been used in the consideration above, and which will be used in the general portion of the theory, is the following only:

1. The system $A$ and the domain of the integral numbers are subsets of the same field (the field of the real numbers).

2. In this field, the domain generates a partition into classes of residues, and the elements of $A$ are distributed among these classes in such a way that if any element $\alpha$ of $A$ belongs to a particular class, then the same class contains an element $\beta = \alpha - s$ which is the inverse element of an element $\alpha'$ of $A$, unless the class of residues is the class (0), i.e. the class formed by the domain itself.

* Or simple continued fractions. Since in this book no other class of continued fractions is used, the notation continued fraction is applied here in this special sense.

Thus the investigation of the continued fractions will start here from a field $K$ containing an integral domain $S$ and a subsystem $A$ which are interconnected in the manner explained just before.


4-11 Convergents of a continued fraction. Let $K$ be a field, $S$ an integral domain in $K$, and $A$ be a subset of $K$ with the following property: If a class of residues $\ne (0)$ of $S$ contains an element of $A$, this class contains also the inverse of an element of $A$. Hence if

$$\alpha, \alpha', \alpha'', \ldots, \alpha_1, \alpha_2, \ldots \qquad (1)$$

denote elements of $A$ and

$$s, s', s'', \ldots, s_1, s_2, \ldots \qquad (2)$$

denote elements of $S$, then every element of $A$ can either be expressed as

$$\alpha = s + \frac{1}{\alpha'} \qquad (3)$$

or as

$$\alpha = s. \qquad (3')$$

Put

$$\alpha_1 = s_1 + \frac{1}{\alpha_2}, \quad \alpha_2 = s_2 + \frac{1}{\alpha_3}, \quad \ldots, \quad \alpha_n = s_n + \frac{1}{\alpha_{n+1}}, \qquad (4)$$

then

$$\alpha_1 = s_1 + \cfrac{1}{s_2 + \cfrac{1}{s_3 + \cdots}} \qquad (5)$$

The representation of $\alpha_1$ by (5) is said to be a continued fraction. The sequence of the formulas (4) can be continued indefinitely, unless there is an element $\alpha_i$ which is an element of $S$. If there is an $m$ such that $\alpha_m = s_m$, then the continued fraction is called finite, otherwise it is called infinite. If $\alpha$ can be represented by a finite continued fraction, it belongs to the quotient field $Q$ of $S$, and every finite set of elements

$$s_1, \ldots, s_m \qquad (6)$$

of $S$ defines an element of $Q$ by the help of (5).

Determine now sequences of elements of $S$ by the following formulas:

$$\begin{aligned}
P_{-1} &= 0, & Q_{-1} &= 1, \\
P_0 &= 1, & Q_0 &= 0, \\
P_k &= s_k P_{k-1} + P_{k-2}, & Q_k &= s_k Q_{k-1} + Q_{k-2}, \quad k = 1, 2, \ldots
\end{aligned} \qquad (7)$$

Let

$$D_k = \begin{vmatrix} P_k & P_{k-1} \\ Q_k & Q_{k-1} \end{vmatrix}, \qquad (8)$$

then from (7) and (8) it follows that $D_k = -D_{k-1}$, and since $D_0 = 1$,

$$D_k = (-1)^k. \qquad (9)$$

From (8) and (9) it follows that $P_k$ and $Q_k$ have no common factors in $S$ other than unities and that

$$\frac{P_k}{Q_k} - \frac{P_{k-1}}{Q_{k-1}} = \frac{(-1)^k}{Q_k Q_{k-1}}. \qquad (10)$$
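
The recurrences (7) and the determinant relation (9) are easy to trace on a numerical case. The following Python sketch is an illustration assuming that $S$ is the domain of the integral numbers; the function name `convergent_terms` and the sample elements are chosen for the example only.

```python
from fractions import Fraction

def convergent_terms(elements):
    """Return the lists [P_0, P_1, ...] and [Q_0, Q_1, ...] defined by formulas (7)."""
    p_prev, q_prev = 0, 1        # P_{-1}, Q_{-1}
    p, q = 1, 0                  # P_0, Q_0
    P, Q = [p], [q]
    for s in elements:           # s runs through s_1, s_2, ...
        p, p_prev = s * p + p_prev, p
        q, q_prev = s * q + q_prev, q
        P.append(p)
        Q.append(q)
    return P, Q

P, Q = convergent_terms([3, 7, 15, 1])      # hypothetical sample elements
# D_k = P_k Q_{k-1} - Q_k P_{k-1} for k = 1, 2, ...
determinants = [P[k] * Q[k - 1] - Q[k] * P[k - 1] for k in range(1, len(P))]
convergents = [Fraction(P[k], Q[k]) for k in range(1, len(P))]
print(P, Q, determinants, convergents)
```

The list of determinants alternates in sign, as (9) requires, and each fraction `P[k] / Q[k]` is already in lowest terms.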


The formulas (4) can be transformed into a homogeneous form as follows:

Let $a_1$ be an arbitrary element $\ne 0$ of $K$; then a sequence of elements $a_1, a_2, \ldots, a_{n+2}$ is uniquely determined by

$$a_i = \alpha_i a_{i+1}, \qquad (11)$$

and by multiplying the equations (4) with $a_2, a_3, \ldots$ respectively, one gets

$$a_i = s_i a_{i+1} + a_{i+2}, \quad \text{for } i = 1, \ldots, n. \qquad (4')$$

From (4') and (7) follows:

$$P_k a_{k+1} + P_{k-1} a_{k+2} = P_{k-1}(s_k a_{k+1} + a_{k+2}) + P_{k-2} a_{k+1} = P_{k-1} a_k + P_{k-2} a_{k+1}.$$

A repeated application of this formula shows that for $i < k$,

$$\begin{aligned}
P_k a_{k+1} + P_{k-1} a_{k+2} &= P_i a_{i+1} + P_{i-1} a_{i+2} = P_0 a_1 + P_{-1} a_2 = a_1, \\
\text{similarly} \quad Q_k a_{k+1} + Q_{k-1} a_{k+2} &= Q_i a_{i+1} + Q_{i-1} a_{i+2} = Q_0 a_1 + Q_{-1} a_2 = a_2.
\end{aligned} \qquad (12)$$

Hence

$$\frac{P_k \alpha_{k+1} + P_{k-1}}{Q_k \alpha_{k+1} + Q_{k-1}} = \frac{P_i \alpha_{i+1} + P_{i-1}}{Q_i \alpha_{i+1} + Q_{i-1}} = \alpha_1. \qquad (13)$$

Multiply the first equation of (12) with $Q_{k-1}$, the second with $-P_{k-1}$, and add; then it follows from (9) that

$$(-1)^k a_{k+1} = a_1 Q_{k-1} - a_2 P_{k-1}. \qquad (14)$$

From (11), (12), (13), and (9) it follows that

$$\alpha_1 - \frac{P_k}{Q_k} = \frac{(-1)^{k-1} a_{k+2}}{a_2 Q_k}. \qquad (15)$$

Hence if the continued fraction is finite, say $\alpha_n = s_n$ and therefore $a_{n+2} = 0$, then

$$\alpha_1 = \frac{P_n}{Q_n}. \qquad (15')$$

The elements $s_1, \ldots, s_n$ of (4) are said to be the elements of the continued fraction; the quotients $P_k / Q_k$ are the convergents, and $\alpha_{n+1}$ is called a complete fraction. As $\alpha_1$ is uniquely defined by these elements, we shall denote $\alpha_1$, if $\alpha_{n+1}$ exists, by

$$\alpha_1 = (s_1, \ldots, s_n \mid \alpha_{n+1}) = \frac{P_n \alpha_{n+1} + P_{n-1}}{Q_n \alpha_{n+1} + Q_{n-1}}, \qquad (5')$$

and if $s_n$ is the last element of the continued fraction, by

$$\alpha_1 = (s_1, \ldots, s_n) = \frac{P_n}{Q_n}. \qquad (6')$$

From (4), (5'), (6') it follows that for $k \le n$

$$\alpha_k = (s_k, \ldots, s_n \mid \alpha_{n+1}), \quad \text{or} \quad \alpha_k = (s_k, \ldots, s_n), \qquad (5'')$$

as the case may be. Conversely, let $\alpha_1 = (s_1, \ldots, s_{k-1} \mid \alpha_k)$, and $\alpha_k$ be given by (5''); then the expansion of $\alpha_1$ into a continued fraction furnishes (5') or (6'), corresponding to the two cases of (5''). Put $k = t$; as $\alpha_t, \alpha_{t+1}, \ldots$ are complete fractions obtained by expanding $\alpha_t$ into a continued fraction, it follows from (11) that $\alpha_t = a_t : a_{t+1}, \ldots, \alpha_{t+i} = a_{t+i} : a_{t+i+1}$.

The terms $P'_i$ and $Q'_i$ corresponding to $\alpha_t$ are determined by

$$\begin{aligned}
P'_{-1} &= 0, & P'_0 &= 1, & P'_i &= s_{t+i-1} P'_{i-1} + P'_{i-2}, \\
Q'_{-1} &= 1, & Q'_0 &= 0, & Q'_i &= s_{t+i-1} Q'_{i-1} + Q'_{i-2}.
\end{aligned} \qquad (16)$$

By applying (12), (13), (14) and (9) to the expansion of $\alpha_t$ into a continued fraction, one gets therefore the following system of formulas:

$$\begin{aligned}
a_t &= P'_i a_{t+i} + P'_{i-1} a_{t+i+1}, \\
a_{t+1} &= Q'_i a_{t+i} + Q'_{i-1} a_{t+i+1}, \\
(-1)^i a_{t+i} &= Q'_{i-1} a_t - P'_{i-1} a_{t+1}, \\
\alpha_t &= \frac{P'_i \alpha_{t+i} + P'_{i-1}}{Q'_i \alpha_{t+i} + Q'_{i-1}}, \\
\begin{vmatrix} P'_i & P'_{i-1} \\ Q'_i & Q'_{i-1} \end{vmatrix} &= (-1)^i.
\end{aligned} \qquad (17)$$

4-12 Finite continued fractions. Without loss of generality, one may suppose that all the elements of the quotient field $Q(S)$ of the domain $S$ belong to the set $A$. As the terms $P_n$ and $Q_n$ of any element of $A$ belong to $S$, their quotients belong to $Q(S)$. In particular, the expansion of $P_n : P_{n-1}$ and $Q_n : Q_{n-1}$ in the homogeneous form (4') of 4-11 can be derived directly from the expansion of $\alpha_1$. It is given by the formulas

$$\begin{aligned}
Q_n &= s_n Q_{n-1} + Q_{n-2} & P_n &= s_n P_{n-1} + P_{n-2} \\
Q_{n-1} &= s_{n-1} Q_{n-2} + Q_{n-3} & P_{n-1} &= s_{n-1} P_{n-2} + P_{n-3} \\
&\;\;\vdots & &\;\;\vdots \\
Q_2 &= s_2 & P_1 &= s_1 \\
Q_1 &= 1 & P_0 &= 1
\end{aligned}$$

Hence

$$\frac{Q_n}{Q_{n-1}} = (s_n, \ldots, s_2), \qquad \frac{P_n}{P_{n-1}} = (s_n, \ldots, s_2, s_1). \qquad (1)$$

If $S$ is the domain of the integral numbers, it is easy to show (and it will be proved later) that an element can be expanded into a finite continued fraction if and only if it belongs to the field $Q(S)$, which in this particular case is the field of the rational numbers. It is interesting to study the same question under more general conditions; a close connection between continued fractions and the factorisation of $S$ will become apparent.

Let $\alpha_1$ be represented by a finite continued fraction $\alpha_1 = (s_1, \ldots, s_n)$. Then $a_n : a_{n+1} = \alpha_n = s_n$. Hence $a_n = s_n a_{n+1} + 0$, thus $a_{n+2} = 0$. From (12) and (14) of 4-11, one gets therefore

$$a_1 = P_n a_{n+1}, \qquad a_2 = Q_n a_{n+1}, \qquad (-1)^n a_{n+1} = a_1 Q_{n-1} - a_2 P_{n-1}. \qquad (2)$$

Hence $\alpha_1 = P_n : Q_n$ belongs to the quotient field of $S$.

Every common factor of $a_1$ and $a_2$ is a factor of $a_{n+1}$, and $a_{n+1}$ is a common factor of $a_1$ and $a_2$. Hence $a_1$ and $a_2$ have an h.c.f., and it can be represented linearly by $a_1$ and $a_2$. Especially $\alpha_1 = P_n : Q_n$ is a representation by two relatively prime elements of $S$, as

$$P_n Q_{n-1} - Q_n P_{n-1} = (-1)^n.$$

Suppose every element of $Q(S)$ is representable by a finite continued fraction, and let $s'$, $s'' \ne 0$ be two arbitrary elements of $S$; then $s' : s'' = \alpha$ can be represented by two relatively prime elements of $S$, so that 1 can be represented in a linear and homogeneous manner by those elements. Hence

$$s' : s'' = p : q \quad \text{and} \quad p q' + q p' = 1.$$

Therefore

$$s' q = s'' p \quad \text{and} \quad s'' p q' + s'' q p' = s'' = q(s' q' + s'' p') = q s.$$

Hence $s' = p s$, $s' q' + s'' p' = s$. So the arbitrary elements $s'$, $s''$ of $S$ have an h.c.f. $(s', s'') = s$ which is represented in a linear and homogeneous manner by $s'$ and $s''$. Thus $S$ is a Euclidean domain (see 2-44).

If therefore $S$ is an integral domain other than a Euclidean one, not every element of $Q(S)$ can be represented as a finite continued fraction, though all these fractions belong to $Q(S)$. The system of all the finite continued fractions is a subset of $Q(S)$ which contains $S$, and therefore it is not a field, unless $S$ is a Euclidean domain.


Let a function $N(s)$ which takes positive integral values only be defined for every element $s \ne 0$ of $S$, and to every pair of elements $s$, $s'$ of $S$ let there exist two other elements $s_1$ and $s''$ so that

$$s = s_1 s' + s'' \quad \text{implies either} \quad s'' = 0, \quad \text{or} \quad N(s'') < N(s'). \qquad (3)$$

Then $s : s' = (s_1 \mid s' : s'') = (s_1, s_2 \mid s'' : s''') = \ldots$, and as

$$N(s') > N(s'') > N(s''') > \ldots > 0$$

are all integral numbers, the sequence $s', s'', \ldots$ must be finite; hence $s : s'$ is a finite continued fraction. From these considerations the following theorem results.

Theorem. Let $S$ be an integral domain, and let a positive integer $N(s)$ be defined satisfying the conditions (3) for every element $s$ of $S$; then $Q(S)$ will be formed by the finite continued fractions of $S$, and the highest common factor of two elements $a_1$ and $a_2$ of $S$ is given by $a_{n+1}$ in (2), where $P_{n-1}$, $Q_{n-1}$ have the same significance as in 4-11.

If $S$ is the domain of the integral numbers, $N(s) = |s|$ satisfies (3); hence every rational number can be expanded into a finite continued fraction with integral elements, and conversely, every finite continued fraction with integral elements represents a rational number. In general, the elements of $S$ are not supposed to be factorisable; in particular the condition $N(ab) \ge N(a)$, which was stated in 2-42, may not hold. If it holds in addition to (3), then $S$ is a Euclidean domain, and therefore the factorisation in $S$ is unique (see 2-44).
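
For the domain of the integral numbers with $N(s) = |s|$, the homogeneous expansion (4') of 4-11 is the familiar division algorithm. The Python sketch below is an illustration under that assumption; the function name `expand` and the sample numbers are chosen for the example, and the comparison with `math.gcd` is only a cross-check.

```python
from math import gcd

def expand(a1, a2):
    """Homogeneous expansion a_i = s_i * a_{i+1} + a_{i+2} for integers a1 and a2 > 0.

    Returns the elements (s_1, ..., s_n) and the last non-zero term a_{n+1}.
    """
    elements = []
    while a2 != 0:
        s, remainder = divmod(a1, a2)   # 0 <= remainder < a2, so N decreases
        elements.append(s)
        a1, a2 = a2, remainder
    return elements, a1

elements, hcf = expand(355, 113)        # hypothetical sample pair
other_elements, other_hcf = expand(84, 36)
print(elements, hcf, other_elements, other_hcf, gcd(84, 36))
```

The expansion stops as soon as a remainder vanishes, and the last non-zero term is the highest common factor of the two starting elements.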


4-13. Proper and improper equivalence. The formulas (12), (13) and (14) of 4-11 show that the complete fractions $\alpha_1, \alpha_2, \ldots$ of a continued fraction are interconnected by linear equations. These can either be expressed in a homogeneous form

$$\begin{aligned}
a_1 &= P_k a_{k+1} + P_{k-1} a_{k+2}, & (-1)^k a_{k+1} &= Q_{k-1} a_1 - P_{k-1} a_2, \\
a_2 &= Q_k a_{k+1} + Q_{k-1} a_{k+2}, & (-1)^k a_{k+2} &= -Q_k a_1 + P_k a_2,
\end{aligned}$$

or in the non-homogeneous form as linear fractional equations

$$\alpha_1 = \frac{P_k \alpha_{k+1} + P_{k-1}}{Q_k \alpha_{k+1} + Q_{k-1}}, \qquad \alpha_{k+1} = \frac{Q_{k-1} \alpha_1 - P_{k-1}}{-Q_k \alpha_1 + P_k}.$$

The coefficients of this substitution are elements of $S$, and the determinant is equal to $\pm 1$. These substitutions will now be considered somewhat more closely. Let $A$ and $B$ be the matrices of such substitutions, say

$$A = \begin{pmatrix} a_1 & a_2 \\ a_3 & a_4 \end{pmatrix}, \quad A' = \begin{pmatrix} a_4 & -a_2 \\ -a_3 & a_1 \end{pmatrix}, \quad \det A = e = \pm 1, \quad \det B = e' = \pm 1; \qquad (1)$$

let $\alpha$ be transformed by $A$ into $\alpha'$, and let $\alpha'$ be transformed by $B$ into $\alpha''$; then (see pp. 43-48):

1. $\alpha$ is transformed by $E$ into $\alpha$, and $\det E = 1$.
2. $\alpha'$ is transformed by $A'$ into $\alpha$, and $\det A' = e$.
3. $\alpha$ is transformed by $BA$ into $\alpha''$, and $\det BA = e e' = \pm 1$.

An element of $K$ is said to be equivalent to $\alpha$ if one gets it by transforming $\alpha$ by a linear fractional substitution with determinant $\pm 1$. From 1, 2 and 3 it follows that this equivalence satisfies the conditions of reflexivity, symmetry and transitivity (see 2-12), and therefore this equivalence defines a partition of $K$ into classes, so that two elements of $K$ are equivalent if and only if they belong to the same class.


By a substitution such as $\begin{pmatrix} 1 & s - 1 \\ 0 & 1 \end{pmatrix}$ the element 1 is transformed into $s$; hence all elements of $S$ are equivalent. From 4-11, (8) and (13) it follows that complete fractions $\alpha_1, \alpha_2, \ldots$ of a continued fraction are all equivalent. So more particularly, every finite continued fraction $(s_1, \ldots, s_n)$ is equivalent to $\alpha_n = s_n$, and therefore belongs to the class containing the elements of $S$.


An equivalence is called proper, or improper, according as $\det A = +1$, or $\det A = -1$. As $\det A = \det A'$, the notion of proper equivalence as well as the notion of improper equivalence is a symmetric one. By combining two equivalences of the same kind, one gets a proper equivalence, and by combining two equivalences of different kinds, an improper equivalence. Every element is properly equivalent to itself, for the matrix $E$ has the determinant 1. If in a class of equivalent elements an element is also improperly equivalent to itself, i.e. if $\alpha$ is transformed into $\alpha$ by $E'$, and $\det E' = -1$, then an arbitrary element $\beta$ of the same class is transformed into $\alpha$ by $B$ and by $E'B$. One of these matrices has the determinant 1, the other has the determinant $-1$, so each element of the class is properly and improperly equivalent to $\alpha$, and therefore every element is at the same time properly and improperly equivalent to every other element of the class. If on the other hand $\alpha$ is transformed into $\beta$ by $A$ as well as by $B$, where $\det A = 1$, $\det B = -1$, then $\alpha$ is transformed into $\alpha$ by $A'B$, where $\det A'B = -1$, and therefore $\alpha$ is improperly equivalent to itself, so that in this case every element of the class $(\alpha)$ is properly and improperly equivalent to every other element of $(\alpha)$. If in $(\alpha)$ there are no pairs of elements properly as well as improperly equivalent, then $(\alpha)$ must be divided into two classes without common elements; the elements of the first class are properly, and the elements of the second class are improperly equivalent to $\alpha$. Elements of the same sub-class are properly equivalent, elements of different sub-classes are improperly equivalent. Hence:

Theorem. In a class of equivalent elements, either every element is properly and improperly equivalent to every other element, or there are two sub-classes without common element, such that elements of the same sub-class are properly, and of different sub-classes are improperly equivalent.

As 1 is transformed into itself by $\begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}$, every pair of elements is properly and improperly equivalent in the class containing the finite continued fractions.


Let $\alpha$ be transformed into itself by $A$. Then

$$\alpha = \frac{a_1 \alpha + a_2}{a_3 \alpha + a_4} \quad \text{holds, hence} \quad a_3 \alpha^2 + (a_4 - a_1) \alpha - a_2 = 0. \qquad (2)$$

There are 3 different cases:

1. $a_3 = (a_4 - a_1) = a_2 = 0$. In this case $A = E$ or $-E$. By these transformations every element is transformed into itself. $E$ and $-E$ generate proper equivalences.

2. The polynomial (2) is reducible in $Q$. This is possible only if $\alpha$ is an element of the quotient field $Q$ of $S$.

3. The polynomial (2) is irreducible. In this case $\alpha$ is algebraic to $Q$ and of order 2 over $Q$.

From these considerations it follows that there are elements which are not improperly equivalent to themselves.


Let the complete fractions $\alpha_1, \alpha_2, \ldots$ of the continued fraction not be all different, say $\alpha_t = \alpha_{t+i}$; then

$$\alpha_1 = (s_1, \ldots, s_{t-1} \mid \alpha_t) = (s_1, \ldots, s_{t+i-1} \mid \alpha_{t+i}).$$

Hence $\alpha_1$ can be represented by an infinite periodic continued fraction with the period $s_t, \ldots, s_{t+i-1}$. From 4-11, (17) it follows that $\alpha_t$ is transformed into itself by the matrix

$$D = \begin{pmatrix} P'_i & P'_{i-1} \\ Q'_i & Q'_{i-1} \end{pmatrix}, \quad \text{where } \det D = (-1)^i,$$

and therefore $\alpha_t$ is made equivalent to itself by that transformation. From 4-11, (13) it follows that

$$\alpha_1 = \frac{P_{t-1} \alpha_t + P_{t-2}}{Q_{t-1} \alpha_t + Q_{t-2}},$$

and therefore it belongs to the field $Q(\alpha_t)$. Hence $[Q(\alpha_t) : Q] \le 2$; thus $\alpha_t$ satisfies an equation of degree 2 with coefficients from $Q$, and as $Q$ is the quotient field of $S$, one can arrange by multiplication with a suitable common factor that the coefficients belong to $S$. The same holds for $\alpha_1$.

Let now a periodic sequence $a_1, \ldots, a_m, a_1, \ldots, a_m, \ldots$ of elements of $S$ be given. Then one does not know whether in any extension of the quotient field $Q$ of $S$ there exists an element representable by the infinite periodic continued fraction

$$(a_1, \ldots, a_m, a_1, \ldots, a_m, \ldots), \qquad (3)$$

and if such an element exists, it may not be uniquely determined in the field. But if there is a field in which there exists one and only one element $\alpha$ represented by the periodic continued fraction (3), then

$$\alpha = (a_1, \ldots, a_m, a_1, \ldots, a_m, \ldots) = (a_1, \ldots, a_m \mid \alpha)$$

holds, and therefore $\alpha$ is a root of a quadratic polynomial in $Q[x]$. The same holds for $\beta = (b_1, \ldots, b_t \mid \alpha) = (b_1, \ldots, b_t, a_1, \ldots, a_m, a_1, \ldots)$, as this number can be transformed into $\alpha$ by a linear transformation in $Q$.

4-2 Representation of the positive numbers by continued fractions

4-21 Correspondence between positive numbers and rational positive
expansions. Let the elements of the set $A$ be the real numbers $\geq 1$,
and let $S$ be the ring of the integers; then the representation

$\alpha = s + 1 : \alpha'$ or $\alpha = s$

is always possible. If $\alpha$ is not an integral number, then

$1 \leq s < \alpha < s + 1$, $\alpha' = 1 : (\alpha - s)$,

and this representation is unique. But if $\alpha$ is an integral positive
number, $\alpha = s' + 1$, $s' = 0, 1, 2, \dots$, there exist two
possibilities:

1. $s = s'$, $\alpha' = 1$,
2. $s = s' + 1$. (1)

Therefore in the representation 4-11, (4) of any positive $\alpha$ as a
continued fraction,

$s_1$ is a non-negative integral number, and
$s_2, s_3, \dots$ are positive integral numbers. (2)

As has been shown in 4-12, the rational numbers can be represented
by finite continued fractions.

Let $\alpha_1$ be a rational number $\geq 1$. If $\alpha_1$ is an integral
number, then there are the two representations of $\alpha_1$ corresponding
to the two cases in (1):

$\alpha_1 = (s' + 1)$

and

$\alpha_1 = s' + 1 : 1 = (s', 1)$.

If $\alpha_1$ is not integral, $\alpha_2$ is uniquely determined by
$\alpha_1$; if $\alpha_2$ is not integral, $\alpha_3$ is uniquely determined
by $\alpha_2$, etc. There is no possibility for alteration in the sequence
$\alpha_1, \alpha_2, \dots$ so long as these elements are not integral.
Since $\alpha_1$ admits a representation by a finite continued fraction, the
last complete fraction in it is necessarily integral. Let $\alpha_{r+1}$
$(r > 0)$ be the first integral complete fraction which occurs; then
$\alpha_r = s_r + 1 : \alpha_{r+1}$ is not integral,
$1 < \alpha_{r+1} = s' + 1$, where $s' > 0$. Thus there are two
possibilities for the continuation of the continued fraction:

$\alpha_{r+1} = (s' + 1)$, or $s_{r+1} = s'$, $\alpha_{r+2} = 1$,

and there exist two and only two representations of $\alpha_1$:

$\alpha_1 = (s_1, \dots, s_r, s' + 1)$, and $\alpha_1 = (s_1, \dots, s_r, s', 1)$.

If $\alpha_1 > 1$ is irrational, then $s_1 > 0$, and
$\alpha_2, \alpha_3, \dots$ are uniquely determined, none of them being
rational; hence there exists one and only one representation satisfying
(2), and this continued fraction is infinite. If $0 < \beta < 1$, then
$1 : \beta = \alpha > 1$, and $\beta = (0 \mid \alpha)$. The essence of
these considerations is given by the following theorem.

Theorem 1. Every positive number can be represented as a continued
fraction satisfying the conditions (2). If the number is irrational, the
representation is unique and the continued fraction is infinite. If the
number is rational, there exist two representations, one by an even finite
continued fraction and the other by an odd finite continued fraction.
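The two finite expansions of a rational number can be illustrated by a short computation. The following is only a sketch: the function names `expansions` and `value` are chosen here for illustration and do not come from the text, and the integral part is taken by the usual division with remainder.

```python
from fractions import Fraction


def expansions(q):
    """Return the two finite continued fractions of a positive rational q.

    The first list is produced by repeated division with remainder; the
    second one replaces its last element s by the pair s - 1, 1, which is
    the alternative described by the two cases in (1).
    """
    q = Fraction(q)
    if q <= 0:
        raise ValueError("q must be positive")
    elems = []
    while True:
        s = q.numerator // q.denominator  # integral part of q
        elems.append(s)
        rest = q - s
        if rest == 0:
            break
        q = 1 / rest                      # next complete fraction
    other = elems[:-1] + [elems[-1] - 1, 1]
    return elems, other


def value(elems):
    """Evaluate a finite continued fraction (s_1, ..., s_n) exactly."""
    v = Fraction(elems[-1])
    for s in reversed(elems[:-1]):
        v = s + 1 / v
    return v


if __name__ == "__main__":
    first, second = expansions(Fraction(7, 5))
    print(first, second, value(first), value(second))
```

One of the two lists always has an even and the other an odd number of elements, since they differ in length by exactly one.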

It will be proved now that the converse theorem also holds, i.e. every
sequence satisfying (2) determines one and only one real number.

Let $s_1, s_2, \dots$ be an infinite sequence satisfying the conditions (2).
Put

$P_{-1} = 0$, $P_0 = 1$, $P_1 = s_1$, $P_2 = s_2 P_1 + 1$, $\dots$, $P_k = s_k P_{k-1} + P_{k-2}$,

$Q_{-1} = 1$, $Q_0 = 0$, $Q_1 = 1$, $Q_2 = s_2$, $\dots$, $Q_k = s_k Q_{k-1} + Q_{k-2}$.

From $P_1 \geq 0$, $Q_1 > 0$, it follows by mathematical induction that

$Q_2, Q_3, \dots, P_2, P_3, \dots > 0$ and that
$0 \leq P_1 < P_2 < P_3 < \dots$,
$Q_0 = 0 < Q_1 \leq Q_2 < Q_3 < \dots$ hold. (3)
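The recurrences for $P_k$ and $Q_k$ translate directly into a loop. The sketch below is an illustration only; the function name `convergents` is an assumption made here, and the starting values are the ones stated above ($P_{-1} = 0$, $P_0 = 1$, $Q_{-1} = 1$, $Q_0 = 0$).

```python
def convergents(elements):
    """Return [(P_1, Q_1), (P_2, Q_2), ...] for the elements s_1, s_2, ...

    Uses P_k = s_k P_{k-1} + P_{k-2} and Q_k = s_k Q_{k-1} + Q_{k-2}
    with P_{-1} = 0, P_0 = 1, Q_{-1} = 1, Q_0 = 0.
    """
    p_prev, p = 0, 1  # P_{-1}, P_0
    q_prev, q = 1, 0  # Q_{-1}, Q_0
    result = []
    for s in elements:
        p_prev, p = p, s * p + p_prev
        q_prev, q = q, s * q + q_prev
        result.append((p, q))
    return result


if __name__ == "__main__":
    for k, (p, q) in enumerate(convergents([5, 10, 10, 10]), start=1):
        print(k, p, q)
```

For the elements 5, 10, 10, 10 the loop produces the pairs (5, 1), (51, 10), (515, 101), (5201, 1020), and consecutive pairs satisfy $P_k Q_{k-1} - P_{k-1} Q_k = (-1)^k$.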


These inequalities hold independently, whether there exists a number
$\alpha_1 = (s_1, s_2, \dots)$, or not. Suppose in particular that
$\alpha_1$ exists, then $P_k : Q_k$ are its convergents (for
$k = 1, 2, \dots$); applying the homogeneous notations
$\alpha_i = a_i : a_{i+1}$ as in 4-11, one may select $a_1 > 0$; then every
$a_i$ is greater than zero. Therefore (see 4-11, (15))

$\alpha_1 - \frac{P_k}{Q_k} = \frac{(-1)^{k+1} a_{k+2}}{a_2 Q_k}$, which is $> 0$ if $k \geq 1$ is odd, and $< 0$ if $k > 1$ is even. (4)

Hence

$\frac{P_{2m}}{Q_{2m}} > \alpha_1 > \frac{P_{2m-1}}{Q_{2m-1}}$, for $m = 1, 2, \dots$ (5)


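Now, the finite continued fractions $(s_1, \dots, s_n) = P_n : Q_n$ exist
at any rate (even if $(s_1, s_2, \dots)$ does not exist), and
$(s_1, \dots, s_k)$ is its $k$-th convergent, provided $k \leq n$. By
applying (5) to $\alpha_1 = P_n : Q_n$ one obtains for $n > 2m$

$\frac{P_{2m}}{Q_{2m}} > \frac{P_n}{Q_n} > \frac{P_{2m-1}}{Q_{2m-1}}$.

Hence

$\frac{P_1}{Q_1} < \frac{P_3}{Q_3} < \dots < \frac{P_{2m+1}}{Q_{2m+1}} < \dots$, and $\frac{P_2}{Q_2} > \frac{P_4}{Q_4} > \dots > \frac{P_{2m}}{Q_{2m}} > \dots$ (6)

for $m = 1, 2, 3, \dots$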


Thus the quotients $P_k : Q_k$ form two sequences, one is increasing, the
other decreasing, and every number of the first sequence is less than every
number of the second one. The intervals

$\left( \frac{P_{2n-1}}{Q_{2n-1}}, \frac{P_{2n}}{Q_{2n}} \right)$

form therefore a nest of intervals, and

$\frac{P_k}{Q_k} - \frac{P_{k-1}}{Q_{k-1}} = \frac{(-1)^k}{Q_k Q_{k-1}}$ (see 4-11, (10)).

The length of these intervals converges to 0. Hence there is, for every
given sequence (2), one and only one real number $\alpha$, so that

$\frac{P_{2m}}{Q_{2m}} > \alpha > \frac{P_{2m-1}}{Q_{2m-1}}$ holds for $m = 1, 2, \dots$ (7)

If there exists a number $\alpha_1 = (s_1, s_2, \dots)$, then $\alpha_1$
must satisfy the inequalities (5); thus by comparison of (5) and (7), it
follows from the uniqueness of the solution of (7) that
$\alpha = \alpha_1$, provided that there exists a number which can be
expanded into the infinite continued fraction $(s_1, s_2, \dots)$. To prove
the existence of this number, the following lemma is helpful.

Lemma. Let $t$ be a positive integral number,
$(s_1, \dots, s_n, s_{n+1}) = A$, $(s_1, \dots, s_n + t) = B$; then

$A - B \geq 0$ if $n$ is even,
$A - B \leq 0$ if $n$ is odd, (8)

and equality holds in (8) if and only if $t = s_{n+1} = 1$.


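Proof. Apply the usual notations, and put

$Q' = (s_n + t) Q_{n-1} + Q_{n-2} = Q_n + t Q_{n-1}$,

then

$(Q' - Q_n) : Q_{n-1} = t$ and $(t Q_{n+1} - Q') : Q_n = t s_{n+1} - 1$.

From 4-11, (10) it follows that
$B - P_{n-1} : Q_{n-1} = (-1)^n : (Q_{n-1} Q')$. Hence

$A - B = \left( A - \frac{P_n}{Q_n} \right) + \left( \frac{P_n}{Q_n} - \frac{P_{n-1}}{Q_{n-1}} \right) - \left( B - \frac{P_{n-1}}{Q_{n-1}} \right)$

$= (-1)^{n+1} \left( \frac{1}{Q_n Q_{n+1}} - \frac{1}{Q_n Q_{n-1}} + \frac{1}{Q' Q_{n-1}} \right)$

$= (-1)^{n+1} \left( \frac{1}{Q_n Q_{n+1}} - \frac{t}{Q_n Q'} \right) = (-1)^n \frac{t s_{n+1} - 1}{Q_{n+1} Q'}$. (9)

Hence the lemma.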
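Formula (9) can be checked numerically with exact rational arithmetic. The sketch below is an illustration under stated assumptions: the helper names `cf_value`, `denominators` and `lemma_holds` are invented here, and the check only samples small elements, so it is a sanity check and not a proof.

```python
from fractions import Fraction
from itertools import product


def cf_value(elems):
    """Evaluate the finite continued fraction (s_1, ..., s_n) exactly."""
    v = Fraction(elems[-1])
    for s in reversed(elems[:-1]):
        v = s + 1 / v
    return v


def denominators(elems):
    """Return [Q_0, Q_1, ..., Q_n] with Q_{-1} = 1, Q_0 = 0."""
    q_prev, q = 1, 0
    qs = [q]
    for s in elems:
        q_prev, q = q, s * q + q_prev
        qs.append(q)
    return qs


def lemma_holds(s, t):
    """Check A - B = (-1)^n (t s_{n+1} - 1) / (Q_{n+1} Q') for s = (s_1..s_{n+1})."""
    n = len(s) - 1
    a_val = cf_value(s)                               # A = (s_1, ..., s_n, s_{n+1})
    b_val = cf_value(s[:n - 1] + [s[n - 1] + t])      # B = (s_1, ..., s_n + t)
    q = denominators(s)
    q_dash = q[n] + t * q[n - 1]                      # Q' = Q_n + t Q_{n-1}
    rhs = Fraction((-1) ** n * (t * s[n] - 1), q[n + 1] * q_dash)
    return a_val - b_val == rhs


if __name__ == "__main__":
    ok = all(
        lemma_holds(list(s), t)
        for length in (2, 3, 4)
        for s in product(range(1, 4), repeat=length)
        for t in range(1, 4)
    )
    print("formula (9) confirmed on the sample:", ok)
```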


In a similar manner, one computes the difference
$(s_1, \dots, s_n - t) - (s_1, \dots, s_n)$. Denote the last convergent of
the first term by $P'' : Q''$, and the other convergents as usual; then
$(Q_n - Q'') : Q_{n-1} = t$ holds, and therefore

$(s_1, \dots, s_n - t) - (s_1, \dots, s_n) = \frac{(-1)^n}{Q_{n-1}} \left( \frac{1}{Q''} - \frac{1}{Q_n} \right) = \frac{(-1)^n t}{Q_n Q''}$.

Hence

$(s_1, \dots, s_n - t) > (s_1, \dots, s_n) > (s_1, \dots, s_n, \dots, s_m)$, (10)

when $n$ is even and $t > 0$, whereas for an odd $n$ and $t > 0$, the
converse inequalities hold. Again, let $P_i : Q_i$ be the convergents of an
infinite continued fraction $(s_1, s_2, \dots)$ satisfying (2), and let
$\alpha_1$ be the real number which is uniquely determined by (7). This
number admits an expansion into a continued fraction which will be proved
to be $(s_1, s_2, \dots)$. If not, suppose $\alpha_1$ is expanded into a
different continued fraction, and let it be an infinite one. Then the
expansion can be expressed by $(s_1, \dots, s_{n-1}, \sigma, \dots)$, where
$n > 0$, $\sigma \neq s_n$. Suppose $\sigma - s_n = t > 0$. If $n$ is even
it follows from (5), (7) and (8) that

$\alpha_1 < (s_1, \dots, s_n + t) \leq (s_1, \dots, s_n, s_{n+1}) < \alpha_1$, (11)

whereas the converse inequalities hold when $n$ is odd. Hence
$\sigma > s_n$ is impossible. If $\sigma < s_n$, put $s_n - \sigma = t$,
and by interchanging the two expansions, one shows in the same way as above
that the supposition leads to a contradiction. Suppose now that the
expansion of $\alpha_1$ is finite; then it follows from (7) that this
finite continued fraction cannot be a convergent of $(s_1, s_2, \dots)$.
Hence it can be expressed by $(s_1, \dots, s_{n-1}, \sigma, \dots)$, where
$\sigma \neq s_n$. For $\sigma > s_n$, the same argument holds as in the
case of an infinite continued fraction. Similarly for $\sigma < s_n$,
provided the finite continued fraction has more than $n$ elements. It
remains to show that $(s_1, \dots, s_n - t) \neq \alpha_1$, but this
follows from (10) and (7), as for an even $n$

$(s_1, \dots, s_n - t) > (s_1, \dots, s_n) > \alpha_1$,

whereas for an odd $n$, the converse inequality holds. Hence $\alpha_1$
cannot be expanded into any continued fraction which is different from
$(s_1, s_2, \dots)$, therefore it is an irrational number; as every positive
number can be expanded into a continued fraction, the expansion of
$\alpha_1$ must be $(s_1, s_2, \dots)$. Thus the following theorem holds.

Theorem 2. Let $s_1, s_2, \dots$ be an infinite sequence satisfying (2);
then there exists a uniquely determined number
$\alpha_1 = (s_1, s_2, \dots)$. This number is irrational, and it does not
admit any different expansion into a continued fraction with elements
satisfying (2).


4-22 Distribution of the continued fractions along the real axis. By
the two theorems of 4-21, it has been shown that to every continued fraction
satisfying (2), there corresponds a non-negative real number; rational
positive numbers can be expanded into exactly two different continued
fractions which are both finite; irrational positive numbers admit one
and only one expansion which is infinite, and 0 is expanded into the
continued fraction with the only element $s_1 = 0$. It is interesting to
investigate the distribution of the continued fractions along the real axis.
The continued fractions with one element, $s_1 = 0, 1, 2, \dots$, represent
these integral numbers and subdivide the positive half of the real axis
into intervals of equal length 1. The continued fractions with two elements
are distributed among these intervals as follows. As
$(s_1, m) = s_1 + 1 : m$, for every positive integral value of $m$,
$(s_1, m)$ is situated in the interval which is bounded by $s_1$ and
$s_1 + 1$, and will be denoted by $I_{s_1}$. For $m = 1$, the number
$(s_1, m)$ is the right end point of the interval, and with $m$ tending to
infinity, it converges steadily to the left end. Thus every interval
$I_{s_1}$ is subdivided into an infinity of abutting subintervals
$I_{s_1 s_2}$, bounded by $(s_1, s_2 + 1)$ and $(s_1, s_2)$, which cover
$I_{s_1}$ except its left endpoint. As

$(s_1, s_2, m) = s_1 + \frac{1}{s_2 + \frac{1}{m}}$,

each of these continued fractions lies in $I_{s_1 s_2}$, and they converge
steadily from the left end of $I_{s_1 s_2}$ to its right end and divide it
into an infinity of abutting intervals $I_{s_1 s_2 s_3}$. Consider now the
interval $I_{s_1 \dots s_n}$ which is bounded by $(s_1, \dots, s_n)$ and
$(s_1, \dots, s_n + 1)$. By applying the methods and formulas of 4-21, one
finds easily

$\frac{(s_1, \dots, s_n, t) - (s_1, \dots, s_n)}{(s_1, \dots, s_n + 1) - (s_1, \dots, s_n)} = \frac{Q_n + Q_{n-1}}{t Q_n + Q_{n-1}}$. (1)

The right hand side of (1) is a positive number which is equal to 1 if
$t = 1$, and converges steadily to zero when $t$ steadily increases. Hence
the points $(s_1, \dots, s_n, t)$ converge steadily from
$(s_1, \dots, s_n + 1)$ to $(s_1, \dots, s_n)$ when $t$ increases; they
subdivide the interval $I_{s_1 \dots s_n}$ into an infinity of abutting
intervals, and these intervals cover $I_{s_1 \dots s_n}$ with the only
exception of $(s_1, \dots, s_n)$. This point is the left or the right end
point of $I_{s_1 \dots s_n}$ according as $n$ is odd or even. To every
infinite sequence $s_1, s_2, \dots$ satisfying 4-21, (2) there corresponds
a sequence of intervals $I_{s_1}, I_{s_1 s_2}, \dots$ which form a nest; the
convergents of $(s_1, s_2, \dots)$ are alternately the left and the right
endpoints of these intervals; hence the nest converges to
$(s_1, s_2, \dots)$. Thus there is a (1, 1)-correspondence between the
positive irrational numbers and the nests. The endpoints of the
nest-intervals are rational numbers, and one gets a classification of the
non-negative rational numbers in the following way. The first class
contains the numbers which can be expanded into continued fractions with
only one element (these numbers are integral), the numbers of the second
class can be expanded into continued fractions with two (but not with one)
elements, etc.; the $m$-th class is composed of the numbers which can be
expanded into continued fractions with $m$ (but not with $m - 1$) elements.

The importance of this classification is made evident by the following
theorem.

Theorem. If $s > 0$, and
$\frac{P_{2n-1}}{Q_{2n-1}} < \frac{r}{s} < \frac{P_{2n}}{Q_{2n}}$, then
$s > Q_{2n} \geq Q_{2n-1}$.

Proof. From the supposition, it follows directly that

$0 < \frac{r}{s} - \frac{P_{2n-1}}{Q_{2n-1}} < \frac{P_{2n}}{Q_{2n}} - \frac{P_{2n-1}}{Q_{2n-1}} = \frac{1}{Q_{2n} Q_{2n-1}}$,

and as $s$ and $Q_{2n-1}$ are positive,
$0 < r Q_{2n-1} - s P_{2n-1} < s : Q_{2n}$; the middle part of this
inequality is an integral positive number. Hence

$1 < s : Q_{2n}$, i.e. $Q_{2n} < s$.

This theorem shows that the closest approximations of a real number
by the help of rational numbers with limited positive denominators are the
approximations by the convergents $P_m : Q_m$.
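The statement of the theorem can be tried out by brute force: between two consecutive convergents one searches for the fraction with the smallest denominator. This is a sketch only; the function names and the sample elements 3, 7, 15, 1 are chosen here for illustration and are not taken from the text.

```python
from fractions import Fraction
from math import floor


def convergents(elements):
    """Return [(P_1, Q_1), (P_2, Q_2), ...] for the elements s_1, s_2, ..."""
    p_prev, p, q_prev, q = 0, 1, 1, 0
    result = []
    for s in elements:
        p_prev, p = p, s * p + p_prev
        q_prev, q = q, s * q + q_prev
        result.append((p, q))
    return result


def smallest_denominator_between(lo, hi):
    """Smallest s > 0 such that some integer r satisfies lo < r/s < hi."""
    s = 1
    while True:
        r = floor(lo * s) + 1        # smallest integer r with r/s > lo
        if Fraction(r, s) < hi:
            return s
        s += 1


if __name__ == "__main__":
    conv = convergents([3, 7, 15, 1])
    for n in (1, 2):
        p_odd, q_odd = conv[2 * n - 2]    # P_{2n-1} : Q_{2n-1}
        p_even, q_even = conv[2 * n - 1]  # P_{2n} : Q_{2n}
        s = smallest_denominator_between(
            Fraction(p_odd, q_odd), Fraction(p_even, q_even)
        )
        print(n, q_even, s)               # s exceeds Q_{2n} in both cases
```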


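Since $\alpha$ lies in the interval bounded by two consecutive convergents
$P_n : Q_n$ and $P_{n+1} : Q_{n+1}$, its distance from $P_n : Q_n$ is less
than the length of the interval. Therefore

$\left| \alpha - \frac{P_n}{Q_n} \right| < \left| \frac{P_{n+1}}{Q_{n+1}} - \frac{P_n}{Q_n} \right| = \frac{1}{Q_n Q_{n+1}} \leq \frac{1}{Q_n^2}$, (2)

hence

$\alpha = \frac{P_n}{Q_n} + \frac{\varepsilon}{Q_n^2}$, (3)

where $|\varepsilon| < Q_n : Q_{n+1} \leq 1$, and $\varepsilon$ is positive
or negative according as $n$ is odd or even. Thus

$P_n = Q_n \alpha - \frac{\varepsilon}{Q_n}$. (4)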


4-3 Periodic continued fractions with integral coefficients. The elements
of the continued fractions considered in this article and its subsections
are supposed to be integral numbers. The first element is positive or zero,
the other elements are positive. A periodic continued fraction with a
period, say

$(b_1, \dots, b_t, a_1, \dots, a_m, a_1, \dots, a_m, \dots)$,

will be denoted by

$(b_1, \dots, b_t, \overline{a_1, \dots, a_m})$. (1)

In the special case where there are no elements $b_i$, the continued
fraction is said to be purely periodic. At any rate, (1) determines uniquely
an irrational number; thus it follows from the last paragraph of 4-13 that
this number is a root of a quadratic polynomial, say $f(x)$, of $R[x]$,
where $R$ denotes the field of the rational numbers. On account of the
irrationality of the roots, $f(x)$ is irreducible in $R[x]$.


4-31 Expansion of quadratic elements into periodic continued fractions.
The converse of the last statement is given by the following theorem.

Theorem. Let $[A : R] = 2$; then every irrational number $\alpha$
belonging to $A$ can be expanded into a periodic continued fraction.


Proof. Let $\alpha$ be the root of an irreducible polynomial

$f(x) = a x^2 + 2 b x + c$. (1)

Expand $\alpha$ into a continued fraction,
$\alpha = (s_1, \dots, s_n \mid \lambda)$; then (see 4-11, (5'))

$\alpha = \frac{P_n \lambda + P_{n-1}}{Q_n \lambda + Q_{n-1}}$,

hence
$a (P_n \lambda + P_{n-1})^2 + 2 b (P_n \lambda + P_{n-1})(Q_n \lambda + Q_{n-1}) + c (Q_n \lambda + Q_{n-1})^2 = 0$;
thus $\lambda$ is a root of

$g x^2 + 2 h x + k = 0$, (2)

where

$\begin{vmatrix} g & h \\ h & k \end{vmatrix} = \begin{vmatrix} P_n & P_{n-1} \\ Q_n & Q_{n-1} \end{vmatrix}^2 \begin{vmatrix} a & b \\ b & c \end{vmatrix} = \begin{vmatrix} a & b \\ b & c \end{vmatrix}$. (3)

Let $\beta$ be the second root of (1) and $\mu$ be defined by

$\mu = - \frac{Q_{n-1} \beta - P_{n-1}}{Q_n \beta - P_n}$, hence $\beta = \frac{P_n \mu + P_{n-1}}{Q_n \mu + Q_{n-1}}$. (4)

As $a \beta^2 + 2 b \beta + c = 0$, the number $\mu$ satisfies (2). From
(4) it follows that

$\mu = - \frac{Q_{n-1}}{Q_n} \pm \frac{1}{Q_n (Q_n \beta - P_n)}$, (5)

where $\pm$ has to be chosen according as $n$ is odd or even.

From (5) and 4-22, (4) it follows therefore that

$\mu = - \frac{Q_{n-1}}{Q_n} \pm \frac{1}{Q_n^2 (\beta - \alpha) + \varepsilon} = - \frac{Q_{n-1}}{Q_n} \left( 1 \mp \frac{1}{Q_n Q_{n-1} (\beta - \alpha) + \varepsilon'} \right) < 0$ if $Q_n Q_{n-1} |\beta - \alpha| > 2$,

where $\varepsilon' = \varepsilon Q_{n-1} : Q_n$, so that
$|\varepsilon'| < 1$.

As $Q_n$ increases infinitely with $n$, and $\beta \neq \alpha$, the root
$\mu$ is negative for sufficiently high values of $n$, and as $\lambda$ is a
complete fraction and therefore positive, $\lambda \mu = k : g < 0$. Hence
$kg < 0$ for all sufficiently high values of $n$. To each of these values,
there corresponds a partition of the positive number
$b^2 - ac = h^2 - kg$ (see (3)) into two non-negative summands $h^2$ and
$-kg$; to the first term there correspond at most 2 values $\pm h$, and to
the second, a finite number of integral factors $k$ and $g$. Hence (3)
admits only a finite number of solutions $g, h, k$ for which $kg < 0$ holds.
Let now $n$ run over $1, 2, \dots$ and consider the corresponding
polynomials (2); for each of these polynomials $h^2 - gk = b^2 - ac$ as (3)
holds, and for sufficiently high values of $n$, the inequality $kg < 0$
holds. Hence the polynomials $g x^2 + 2 h x + k$ cannot all be different.
At least one polynomial with $kg < 0$ must occur more than once. This
polynomial has exactly one positive root, which is equal to the
corresponding complete fraction. Hence the complete fractions are not all
different, say

$\alpha_q = \alpha_{q+m}$.

Hence $\alpha_q = (c_1, \dots, c_m \mid \alpha_q) = (\overline{c_1, \dots, c_m})$

is represented by a purely periodic continued fraction. Hence $\alpha$ is
represented by a periodic continued fraction.

4-32 Purely periodic continued fractions. Let $\alpha_1, \alpha_2, \dots$
be the complete fractions of
$(s_1, \dots, s_m, \overline{s_{m+1}, \dots, s_{m+\nu}})$. As
$\alpha_i = \alpha_{i+\nu}$ for $i > m$, every property which holds for the
complete fractions of sufficiently high index, holds for every complete
fraction of index $> m$. From 4-31 one knows that the root conjugate to a
complete fraction of sufficiently high index is negative. Therefore this
property holds for every complete fraction of index $> m$, i.e., for every
purely periodic continued fraction. In a purely periodic continued fraction
$s_1 \neq 0$ (see 4-21, (2)), hence these continued fractions represent
numbers $> 1$.

A root $\lambda$ of a quadratic equation is said to be reduced if
$\lambda > 1$, and the conjugate root $\mu$ satisfies $0 > \mu > -1$. It
will be proved now that every purely periodic continued fraction represents
a reduced quadratic number.

Let

$\alpha = (\overline{s_1, \dots, s_n}) = (s_1, \dots, s_n \mid \alpha)$ (1)

$\xi = (\overline{s_n, \dots, s_1}) = (s_n, \dots, s_1 \mid \xi)$ (2)

be two purely periodic continued fractions, the elements $s_i$ being the
same in both the continued fractions, but ordered in an inverse manner.

Let $P_i : Q_i$ be the convergents of $\alpha$. Then (see 4-12, (1))

$\frac{P_n}{P_{n-1}} = (s_n, \dots, s_1)$ and $\frac{Q_n}{Q_{n-1}} = (s_n, \dots, s_2)$

are convergents of $\xi$; hence
$\xi = \frac{P_n \xi + Q_n}{P_{n-1} \xi + Q_{n-1}}$ holds. Let
$\beta = -1 : \xi$; then $0 > \beta > -1$, and
$P_{n-1} \beta^{-2} + (P_n - Q_{n-1}) \beta^{-1} - Q_n = 0$, hence $\beta$
is a root of $f(x) = Q_n x^2 + (Q_{n-1} - P_n) x - P_{n-1}$.

As $\alpha = \frac{P_n \alpha + P_{n-1}}{Q_n \alpha + Q_{n-1}}$ is also a
root of $f(x)$, and as $\alpha > 1$, the roots $\alpha$ and $\beta$ are
different. The essence of these considerations is therefore:

Theorem. If $\alpha$ is represented by a purely periodic continued fraction
(1), it is a reduced quadratic number; let $\beta$ be the number conjugate
to $\alpha$, and $\xi = -1 : \beta$; then $\xi$ is represented by (2).

Every continued fraction is equivalent to its complete fractions (see
4-13); especially every periodic continued fraction is equivalent to a
purely periodic continued fraction; hence

Corollary. Every quadratic number is equivalent to a reduced quadratic
number.

4-33 Scheme for calculation. In order to find out the representation
of any quadratic number $\alpha$ by a continued fraction, represent
$\alpha$ by

$\alpha = \frac{a + \sqrt{D}}{b} = s + \frac{1}{\alpha'}$, where $s < \alpha < s + 1$

and $a$, $b$, $s$ and $D > 0$ are integral numbers.

Then

$\alpha' = \frac{a' + \sqrt{D}}{b'}$,

where

$a' = bs - a$, $b' = (D - a'^2) : b$.

Starting from these formulas, a simple numerical scheme given below
furnishes the numbers $a, b, a', b', \dots$ defining uniquely the complete
fractions $\alpha, \alpha', \dots$ and the numbers $s, s', \dots$ defining
the continued fraction. As this continued fraction is periodic, one pair
$a, b$ must be repeated after a finite number of steps. Then the first
period is finished and the calculation can be terminated.
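The scheme can be written as a short program working with integers only. The following is a sketch under assumptions that go slightly beyond the text: the name `quadratic_cf` is invented here, $D$ is taken to be a positive non-square integer, and $b$ is required to divide $D - a^2$ so that every $b'$ is integral. The integral part $s$ is obtained from the integer square root of $D$.

```python
from math import isqrt


def quadratic_cf(a, b, D):
    """Expand (a + sqrt(D)) / b by the scheme a' = b s - a, b' = (D - a'^2) / b.

    Returns (pre_period, period) as two lists of elements.
    Assumes D > 0 is not a square and b divides D - a*a.
    """
    if D <= 0:
        raise ValueError("D must be positive")
    r = isqrt(D)
    if r * r == D:
        raise ValueError("D must not be a square")
    if b == 0 or (D - a * a) % b != 0:
        raise ValueError("b must be non-zero and divide D - a^2")
    seen = {}    # pair (a, b) -> position at which it first occurred
    elems = []
    while (a, b) not in seen:
        seen[(a, b)] = len(elems)
        # integral part of (a + sqrt(D)) / b, exact because sqrt(D) is irrational
        s = (a + r) // b if b > 0 else (-a - r - 1) // (-b)
        elems.append(s)
        a = b * s - a
        b = (D - a * a) // b
    start = seen[(a, b)]
    return elems[:start], elems[start:]


if __name__ == "__main__":
    print(quadratic_cf(-1, 2, 5))
    print(quadratic_cf(0, 1, 26))
    print(quadratic_cf(0, 1, 200))
```

The calculation stops as soon as a pair $a, b$ is repeated, exactly as described above; the elements before the first occurrence of that pair form the non-periodic part.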

Examples

1. $\alpha = \frac{-1 + \sqrt{5}}{2}$ (harmonic section), $D = 5$

     a    b                                          s
    -1    2    $0 < \frac{-1 + \sqrt{5}}{2} < 1$     0
     1    2    $1 < \frac{1 + \sqrt{5}}{2} < 2$      1
     1    2

The last complete fraction is therefore equal to the preceding; hence
$\alpha = (0, \overline{1})$. This continued fraction is the simplest, but
the least convenient one for practical calculation, as the numbers
$P_k, Q_k$ are increasing more slowly than in any other case.

2. $\alpha = \sqrt{26}$

     a    b    s
     0    1    5
     5    1    10
     5    1

hence $\alpha = (5, \overline{10})$. This example is very convenient for
quick and exact calculation.


    $P_0 = 1$         $Q_0 = 0$
    $P_1 = 5$         $Q_1 = 1$
    $P_2 = 51$        $Q_2 = 10$
    $P_3 = 515$       $Q_3 = 101$
    $P_4 = 5201$      $Q_4 = 1020$
    $P_5 = 52525$     $Q_5 = 10301$
    $P_6 = 530451$    $Q_6 = 104030$

Hence $\alpha = \frac{530451}{104030} - \varepsilon$, where
$0 < \varepsilon < 10^{-10}$.

Therefore $\alpha = 5.099019513(60)$,

the last two figures being uncertain (see 4-22, (3)).


3. $\alpha = \sqrt{2}$, $D = 2$

     a    b    s
     0    1    1
     1    1    2
     1    1

Hence $\sqrt{2} = (1, \overline{2})$. As $P_n, Q_n$ are increasing very
slowly, the method will be applied in a modified form:
$\sqrt{2} = \sqrt{200} : 10$.

Thus expand $\sqrt{200}$ into a continued fraction. $D = 200 = 14^2 + 4$.

     a    b                                           s
     0    1    $14 < \sqrt{200} < 15$                 14
    14    4    $7 < \frac{14 + \sqrt{200}}{4} < 8$    7
    14    1    $28 < 14 + \sqrt{200} < 29$            28
    14    4

Hence $\sqrt{200} = (14, \overline{7, 28})$,

    $P_0 = 1$        $Q_0 = 0$
    $P_1 = 14$       $Q_1 = 1$
    $P_2 = 99$       $Q_2 = 7$
    $P_3 = 2786$     $Q_3 = 197$
    $P_4 = 19601$    $Q_4 = 1386$

$\sqrt{200} = \frac{19601}{1386} - \varepsilon$, $0 < \varepsilon < \frac{1}{Q_4 Q_5} < \frac{1}{28 Q_4^2} < 3 \cdot 10^{-8}$.

$\sqrt{2} = \frac{19601}{13860} - \varepsilon' = 1.41421356(4)$,

correct up to eight figures after the decimal point, as
$0 < \varepsilon' < 3 \cdot 10^{-9}$.

Exercises. Prove that $(a, \overline{2a}) = \sqrt{a^2 + 1}$, and calculate
$\sqrt{2501}$, $\sqrt{82}$, [...], $\sqrt{17}$. Calculate $\sqrt{3}$
directly and also by the help of $\sqrt{300}$.
Compute yj (a 2 + i) 

4-34 Reduced quadratic numbers. To prove the converse proposition
of the theorem of 4-32, the following lemma is useful.

Lemma. If $\alpha > 1$ and $\beta < 0$ are conjugate quadratic numbers,
$\alpha = (s, s_1, s_2, \dots)$, then all the complete fractions
$\alpha_1 = (s_1, s_2, \dots)$, $\alpha_2 = (s_2, \dots)$, $\dots$ are
reduced numbers.

Proof. $\alpha = s + 1 : \alpha_1$, $\beta = s + 1 : \beta_1$; $\alpha_1$
and $\beta_1$ are conjugate numbers, $\alpha_1 > 1$,
$-1 : \beta_1 = s - \beta > s \geq 1$; hence $\alpha_1$ is reduced, and by
repetition of this procedure it follows that $\alpha_2, \dots$ are reduced.

Theorem. Every reduced quadratic number is represented by a purely
periodic continued fraction.

Proof. Every quadratic number is represented by a periodic continued
fraction $(a_1, \dots, s, \overline{s_1, \dots, s_n})$. Let this number be
reduced and let the periodicity of the continued fraction begin with $s_1$
only (i.e. let $s \neq s_n$); then it follows from the last lemma that
$\alpha = (s, \overline{s_1, \dots, s_n})$ is a reduced number too. We will
prove that this is impossible. Using the same notations as in the lemma we
state

$\alpha_1 = \alpha_{n+1}$, hence $\beta_1 = \beta_{n+1}$;

$\alpha = s + \frac{1}{\alpha_1}$, $\alpha_n = s_n + \frac{1}{\alpha_{n+1}}$, hence $\beta = s + \frac{1}{\beta_1}$, $\beta_n = s_n + \frac{1}{\beta_{n+1}}$,

$- \frac{1}{\beta_1} = s - \beta$, $- \frac{1}{\beta_{n+1}} = s_n - \beta_n$,

but as $\alpha$ and $\alpha_n$ are reduced, $0 < -\beta < 1$, and
$0 < -\beta_n < 1$ hold; hence $s < -1 : \beta_1 < s + 1$, and
$s_n < -1 : \beta_{n+1} < s_n + 1$. From $\beta_1 = \beta_{n+1}$ it follows
therefore that $s = s_n$.

4-35 Expansion of square roots.

Theorem. Let $\alpha = \sqrt{r} > 1$ be irrational; then

$\alpha = (s, \overline{s_1, \dots, s_n})$, (1)

$s_n = 2s$, and for $i = 1, \dots, n - 1$ (2)

$s_i = s_{n-i}$ (3)

hold. If conversely (1), (2) and (3) hold, then $\alpha$ is an irrational
square-root $> 1$.

Proof. As $\alpha > 1$ and the number $\beta = -\alpha < 0$ is conjugate to
$\alpha$, it follows from the lemma that the complete fractions
$\alpha_1, \alpha_2, \dots$ of $\alpha$ are reduced, and therefore purely
periodic. Hence $\alpha$ satisfies (1). As $\alpha = s + 1 : \alpha_1$ and
$-\alpha = \beta = s + 1 : \beta_1$, $\alpha_1$ and $\beta_1$ are conjugate,

$\alpha_1 = (\overline{s_1, \dots, s_n})$. (4)

Therefore it follows from the theorem of 4-32 that

$-1 : \beta_1 = (\overline{s_n, \dots, s_1})$. (5)

$0 = \alpha + \beta = (s + \alpha) + 1 : \beta_1$. Hence
$-1 : \beta_1 = s + \alpha$.

Therefore $(\overline{s_n, \dots, s_1}) = (2s, \overline{s_1, \dots, s_n})$ (6)

holds. (6) implies (2) and (3). Conversely if $\alpha$ is defined by (1),
(2) and (3), then (6) holds.

Let $\alpha_1$ and $\beta_1$ be defined by (4) and (5), and let
$\beta = s + 1 : \beta_1$; then it follows from (4) and (5) that
$\alpha_1$ and $\beta_1$, and therefore $\alpha$, $\beta$, are conjugate,
and from (6) it follows that $\alpha + \beta = 0$. Hence $\alpha$ and
$\beta$ are the roots of a rational polynomial $1 x^2 + 0 x - r$. From
$s_n = 2s$, it follows that $s \geq 1$, and therefore $\alpha > 1$, and as
(1) is an infinite continued fraction, $\alpha$ must be irrational.

Corollary. Let $\alpha = \sqrt{r : t}$ and $P_i : Q_i$ be the convergents
of $\alpha$; then

$t P_{kn}^2 - r Q_{kn}^2 = (-1)^{kn} t$ (7)

holds for every $k = 1, 2, \dots$

Proof. Let $\alpha_1, \dots$ be the complete fractions of $\alpha$; then
$\alpha_1 = \alpha_{1+kn}$,

$\alpha_{kn} = s_{kn} + \frac{1}{\alpha_{kn+1}} = 2s + \frac{1}{\alpha_1} = s + \alpha$.

But, as $\alpha = \frac{P_{kn} \alpha_{kn} + P_{kn-1}}{Q_{kn} \alpha_{kn} + Q_{kn-1}}$,

$\alpha = \frac{P_{kn} (s + \alpha) + P_{kn-1}}{Q_{kn} (s + \alpha) + Q_{kn-1}}$

holds; hence

$\alpha^2 Q_{kn} - P_{kn} s - P_{kn-1} + \alpha (Q_{kn} s + Q_{kn-1} - P_{kn}) = 0$.

As $\alpha^2 = r : t$ is rational, and $\alpha$ is irrational,

$P_{kn-1} + P_{kn} s - Q_{kn} \, r : t = 0$,
$-P_{kn} + Q_{kn} s + Q_{kn-1} = 0$

hold. Multiply these equations with $Q_{kn} t$ and $-P_{kn} t$
respectively and add; then
$t P_{kn}^2 - r Q_{kn}^2 + t (P_{kn-1} Q_{kn} - Q_{kn-1} P_{kn}) = 0$,
whence (7) follows directly.

4-4 Applications to theory of numbers

It is proposed to solve

$ax - by = 1$ (1)

by integral $x$ and $y$.

Obviously (1) cannot be solved if there is a common factor of $a$ and $b$
different from $\pm 1$. Therefore suppose $a$ and $b$ to be relatively
prime. $a : b$ can be represented by an even continued fraction (see 4-21,
theorem 1):

$a : b = (s_1, \dots, s_{2m})$,

$a : b = P_{2m} : Q_{2m}$, and as $a$ and $b$ are positive and relatively
prime, $a = P_{2m}$, $b = Q_{2m}$, and therefore

$a Q_{2m-1} - b P_{2m-1} = P_{2m} Q_{2m-1} - Q_{2m} P_{2m-1} = (-1)^{2m} = 1$

holds. Hence one gets the integral solutions by

$x = Q_{2m-1} + k b$,
$y = P_{2m-1} + k a$,

where $k = 0, \pm 1, \pm 2, \dots$
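The construction above can be followed step by step in code. This is a sketch, not a definitive implementation: the name `solve_linear` is chosen here, $a$ and $b$ are assumed positive and relatively prime, and the even expansion is obtained from the usual one by the replacement of the last element $s$ by $s - 1, 1$ when necessary.

```python
def solve_linear(a, b):
    """Return integers (x, y) with a*x - b*y == 1 for coprime positive a, b."""
    if a <= 0 or b <= 0:
        raise ValueError("a and b must be positive")
    # Expand a : b into a finite continued fraction by repeated division.
    elems = []
    num, den = a, b
    while den:
        s, rem = divmod(num, den)
        elems.append(s)
        num, den = den, rem
    if num != 1:
        raise ValueError("a and b must be relatively prime")
    # Theorem 1 of 4-21: switch to the expansion with an even number of elements.
    if len(elems) % 2 == 1:
        elems[-1] -= 1
        elems.append(1)
    # Convergents; after the loop p : q = P_2m : Q_2m = a : b.
    p_prev, p, q_prev, q = 0, 1, 1, 0
    for s in elems:
        p_prev, p = p, s * p + p_prev
        q_prev, q = q, s * q + q_prev
    return q_prev, p_prev  # x = Q_{2m-1}, y = P_{2m-1}


if __name__ == "__main__":
    x, y = solve_linear(7, 5)
    print(x, y, 7 * x - 5 * y)
```

Every further solution is obtained by adding $kb$ to $x$ and $ka$ to $y$, as stated in the text.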
To solve by integral $x$ and $y$,

$x^2 - d y^2 = 1$ (Pell's equation †), (2)

where $d$ is a positive integral number.

From 4-35 it follows that
$\sqrt{d} = \alpha = (s, \overline{s_1, \dots, s_n})$; applying the
corollary, it results that

$P_{kn}^2 - d Q_{kn}^2 = (-1)^{kn}$.

Therefore if $n$ is even, $(x, y) = (P_{kn}, Q_{kn})$,

and if $n$ is odd, $(x, y) = (P_{2kn}, Q_{2kn})$

are solutions for every positive integral $k$.

Example. $x^2 - 26 y^2 = 1$

$\sqrt{26} = (5, \overline{10})$, $n = 1$.

By this method one gets the solutions

$(x, y) = (P_2, Q_2) = (51, 10)$
$= (P_4, Q_4) = (5201, 1020)$
$= (P_6, Q_6) = (530451, 104030)$
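The example can be reproduced by running through the convergents of $\sqrt{d}$ and keeping those which satisfy the equation. The sketch below makes two simplifications that are not part of the text: the name `pell_solutions` is invented, and instead of selecting the indices $kn$ or $2kn$ in advance it simply tests each convergent, which picks out the same pairs. It assumes that $d$ is a positive integer which is not a square.

```python
from math import isqrt


def pell_solutions(d, count):
    """Return the first `count` pairs (x, y) of convergents of sqrt(d)
    with x*x - d*y*y == 1. Assumes d > 0 is not a square."""
    r = isqrt(d)
    if d <= 0 or r * r == d:
        raise ValueError("d must be a positive non-square integer")
    a, b = 0, 1                        # complete fraction (a + sqrt(d)) / b
    p_prev, p, q_prev, q = 0, 1, 1, 0  # P_{-1}, P_0, Q_{-1}, Q_0
    solutions = []
    while len(solutions) < count:
        s = (a + r) // b               # next element (b stays positive here)
        a = b * s - a                  # scheme of 4-33
        b = (d - a * a) // b
        p_prev, p = p, s * p + p_prev
        q_prev, q = q, s * q + q_prev
        if p * p - d * q * q == 1:
            solutions.append((p, q))
    return solutions


if __name__ == "__main__":
    print(pell_solutions(26, 3))
```

For $d = 26$ the three pairs found are (51, 10), (5201, 1020) and (530451, 104030), in agreement with the example.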


*4-5 Continued fractions with elements $\varphi(x)$

The general method of continued fractions, the elements of which have
been given in 4-1, can be applied to different fields $K$. In 4-2 to 4-4,
the field of the real numbers was chosen for $K$, and the expansion of a
positive real number into a continued fraction with integral elements was
studied. Thus the positive real numbers formed the system $A$, and the
integral numbers formed the domain $S$ mentioned in the general theory.
The expansion of positive numbers into a continued fraction has furnished
the opportunity for an approximation of real numbers by rational numbers.
There arises the question now, whether in a similar manner the power series
in an indeterminate $x^{-1}$ can be expanded into continued fractions with
polynomials in $x$ as their elements, and whether by this method the power
series can be approximated by quotients of polynomials. "Approximation"
always means that to a certain entity which is to be approximated, an
infinite sequence of approximating entities is constructed such that the
differences between the approximated and the approximating entities can be
made as small as one likes. Thus an approximation implies that these
differences are measurable, and a sequence approximating an entity may be
made non-approximating when a different method of measuring the differences
is used. In the theory which will be given here, the power series are
measured by their "degrees", which are integral numbers; the measuring does
not imply any consideration of convergence. The theory is therefore so
general that nothing need be supposed about the field of the coefficients
of the power series. On the other hand, this generality makes it necessary
to define afresh the sum, the product etc. of power series. This definition
must be done in such a manner that it tallies with the definition for the
corresponding operation on polynomials for the case when only a finite
number of coefficients is different from zero.

† Solved by Brahmagupta (born 598 A.D.) and independently by Fermat (1657).
The name "Pell's equation" has no historical justification, but it is
commonly used (L. E. Dickson, History of the theory of numbers, Vol. II;
B. B. Datta and A. N. Singh, History of Hindu Mathematics, Part II).

* May be omitted at a first reading.

4-51 The field $B$. Let $F$ be an arbitrary field and $x$ an indeterminate
not included in $F$. The elements of $F$ will be denoted by $a, b, c, d$,
with or without indices. The elements of the ring $F[x]$ will be denoted by

$f(x), g(x), \dots$ (1)

The integral domain $F[x]$ will be chosen as $S$.

In order to get a new system $A$, create new elements denoted by Greek
letters

$\varphi(x), \chi(x), \omega(x), \dots$ (2)

in the following manner:

$\varphi(x) = a_n x^n + a_{n-1} x^{n-1} + \dots + a_0 + a_{-1} x^{-1} + \dots + a_{-k} x^{-k} + \dots$

$= 0 x^{n+m} + \dots + 0 x^{n+1} + a_n x^n + \dots + a_{-k} x^{-k} + \dots = \sum_{k=-\infty}^{n} a_k x^k$. (3)

This purely formal definition means that to every sequence of coefficients
from $F$ with fixed decreasing integral indices

$a_n, a_{n-1}, \dots$

there corresponds one of the new elements, and this element will not be
changed if one puts before $a_n$ a finite set of null-coefficients. The
addition and the multiplication of the elements (2) will be defined now in
such a way that the elements (2) for which $a_k = 0$ for $k < 0$ form a
subring isomorphic to $F[x]$.

Definition. Let $n \geq m$, $\varphi(x) = \sum_{-\infty}^{n} a_k x^k$,
$\psi(x) = \sum_{-\infty}^{m} b_k x^k = \sum_{-\infty}^{n} b_k x^k$,
where $0 = b_{m+1} = \dots = b_n$ if $n > m$; then

$\varphi(x) + \psi(x) = \chi(x) = \sum_{-\infty}^{n} c_k x^k$ and

$\varphi(x) \psi(x) = \omega(x) = \sum_{-\infty}^{n+m} d_k x^k$,

where $c_k = a_k + b_k$, $d_k = \sum_{i} a_i b_{k-i}$ $(n \geq i \geq k - m)$. (4)

The definition is obviously independent of null-coefficients put before;
the commutative, associative and distributive laws hold, and the
subtraction is uniquely defined by

$b_k = c_k - a_k$.

Hence for the null-element every coefficient is 0. If for the coefficients
of $\psi(x)$ the conditions $b_k = 0$ for $k \neq 0$ hold, then in
$\varphi(x) \psi(x) = \sum d_k x^k$, $d_k = b_0 a_k$ holds.

The elements (2) form a ring $B$, and those elements for which $a_k = 0$
for $k < 0$ form a subring for which the addition and multiplication has
been defined in the same way as for polynomials. Hence there is an
isomorphism $I$ by which this subring becomes isomorphic to $F[x]$. Let
$\varphi(x) \neq 0$; then $\varphi(x)$ has at least one coefficient
$\neq 0$; let $n$ be the highest index of the non-vanishing coefficients,
then

$\varphi(x) = \sum_{-\infty}^{n} a_k x^k$, $a_n \neq 0$;

$n$ is said to be the degree of $\varphi(x)$. From (4) follows directly:

The degree of a product is equal to the sum of the degrees of the factors.

The degree of a sum of elements of different degrees is equal to the
maximum degree of the summands.

From this remark it follows that a product of two elements $\neq 0$ cannot
be equal to 0. Hence the ring $B$ is an integral domain. Now identify
the elements of $F[x]$ with the corresponding elements (2). For polynomials
$\neq 0$ of $F[x]$, the degree tallies with that of the corresponding
elements of $B$, but the zero-element, which is a polynomial of degree $-1$
in $F[x]$, has no degree when considered as an element of $B$.

The elements $\sum_{-\infty}^{0} b_k x^k$, $b_k = 0$ for $k \neq 0$, are
identified with $b_0$, and
$b_0 \varphi(x) = \sum_{-\infty}^{n} b_0 a_k x^k$ holds. Let
$\psi_k(x) = x^{-k} + 0 x^{-k-1} + \dots$; then $x^k \psi_k(x) = 1$. Every
field containing $B$ contains the quotient field $Q$ of $F[x]$. The
elements of $B$ which are quotients of elements of $F[x]$ form a ring
which is isomorphic to a subring of $Q$. The elements of this ring will
therefore be identified with the corresponding elements of $Q$. So
$\psi_k(x)$ is identified with $x^{-k}$, and the finite sum
$\sum_{-m}^{n} a_k x^k$ is identified with the symbolic sum
$\psi(x) = \sum_{-\infty}^{n} a_k x^k$, where $a_k = 0$ for $k < -m$.

Using these notations, one can extend the algorithm of division of
the polynomials to the elements of $B$.

Let $\varphi(x)$ and $\psi(x)$ be of degree $n$ and $m$ respectively, where
$n \geq m$, and $a_n$ and $b_m$ their highest coefficients;

$a_n : b_m = c_{n-m}$, $\varphi_1(x) = \varphi(x) - c_{n-m} x^{n-m} \psi(x)$ is of degree $n_1 < n$.

A repetition of this procedure furnishes

$\varphi_2(x) = \varphi_1(x) - c_{n_1-m} x^{n_1-m} \psi(x) = \varphi(x) - (c_{n-m} x^{n-m} + c_{n_1-m} x^{n_1-m}) \psi(x)$,

and by further repetitions one gets an enumerable set of elements $c_k$ of
$F$ for $k \leq n - m$, so that
$\chi_\nu(x) = \sum_{k=n_\nu-m}^{n-m} c_k x^k$, and

$\varphi_{\nu+1}(x) = \varphi(x) - \chi_\nu(x) \psi(x)$ is of degree $n_{\nu+1} < n_\nu$.

Let $\chi(x) = \sum_{-\infty}^{n-m} c_k x^k = \chi_\nu(x) + \omega_\nu(x)$;
then $\omega_\nu(x)$ is of degree $n_{\nu+1} - m$, and therefore
$\psi(x) \omega_\nu(x)$ is of degree $n_{\nu+1}$.
$\varphi(x) - \psi(x) \chi(x) = \varphi(x) - \psi(x) \chi_\nu(x) - \psi(x) \omega_\nu(x)$
is of degree $< n_\nu$ for every $\nu$; hence this difference is 0.
Hence $\varphi(x) = \psi(x) \chi(x)$ holds. As $\psi(x)$ was supposed to be
an arbitrary element $\neq 0$ of $B$, it follows:

Theorem. The set $B$ of the elements (2) is a field containing the
quotient field $Q$ of $F[x]$.
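The division procedure can be imitated on truncated coefficient lists. The sketch below rests on assumptions made only for illustration: $F$ is taken to be the field of rational numbers, an element of degree $n$ is stored as the pair `(n, [a_n, a_{n-1}, ...])` with missing coefficients read as zero, and the name `divide` is not from the text. Only the first `terms` coefficients of the quotient are computed; they depend only on the first `terms` coefficients of the two given elements.

```python
from fractions import Fraction


def divide(phi, psi, terms):
    """First `terms` coefficients of phi / psi in descending powers of x.

    phi and psi are pairs (degree, coefficients), the coefficients being
    listed from the highest power downwards; the leading coefficient of
    psi must be non-zero. Returns (degree, coefficients) of the quotient.
    """
    (n, a), (m, b) = phi, psi
    if not b or b[0] == 0:
        raise ValueError("leading coefficient of psi must be non-zero")
    # Remainder, truncated to the coefficients of x^n, ..., x^(n - terms + 1).
    rem = [Fraction(a[j]) if j < len(a) else Fraction(0) for j in range(terms)]
    quot = []
    for j in range(terms):
        c = rem[j] / b[0]              # coefficient of x^(n - m - j)
        quot.append(c)
        for i, b_i in enumerate(b):    # subtract c * x^(n - m - j) * psi
            if j + i >= terms:
                break
            rem[j + i] -= c * b_i
    return n - m, quot


if __name__ == "__main__":
    # 1 : (x - 1) = x^-1 + x^-2 + x^-3 + ...
    print(divide((0, [1]), (1, [1, -1]), 5))
```

Dividing $x^2 - 1$ by $x - 1$ in the same way gives the coefficients of $x + 1$ followed by zeros, so that the procedure tallies with the division of polynomials when the quotient is a polynomial.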

4-52 Expansion of the elements of B into continued fractions. The 
general theory of continued fractions which has been explained m 4-1 can 
be applied to the expansion of elements of the type <p(x) if the field B is 
taken for K, the domain jF[>] is taken for S, and the sj stem of the elements 
with non-negative degree is taken for the system A It is easy to prove that 
those elements satisfy the conditions required for elements of a system A 
(see 4-11), Of course, an element is of positive degree if nnd onlv if its 
inverse is of negative degree 

Now

$\varphi(x) = \sum_{k=-\infty}^{n} a_k x^k = \sum_{k=0}^{n} a_k x^k + \sum_{k=-\infty}^{-1} a_k x^k = f(x) + \omega(x)$. (1)

Hence either $\varphi(x) = f(x)$,

or $\varphi(x) = f(x) + \frac{1}{\theta(x)}$, where degree $\theta(x) > 0$.

The representation (1) of $\varphi(x)$ is uniquely determined. There exists
therefore an expansion of $\varphi(x)$ into a continued fraction. The first element
$s_1 = f(x)$ may be of any degree $\ge 0$ or it might be the zero-element, whereas
the second element $s_2$ (if any) can be supposed to be of positive degree;
the same holds for $s_3, s_4, \ldots$. If one makes this supposition about the
degrees, the expansion is uniquely determined. Hence

Theorem. The elements 4-51, (2) can be represented in one and only
one manner by a continued fraction $(s_1, s_2, \ldots)$ where $s_i$ is a polynomial in x,
whose degree is > 0 for i > 1.

The degree of a polynomial s has just the properties of the function
N(s) in 4-12. From that section and the preceding theorem therefore
results

Corollary. The finite continued fractions represent the elements of Q,
and every element of Q is represented by a finite continued fraction.
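
The corollary can be tried out on a small case. The following is a minimal
Python sketch, not taken from the book: polynomials are assumed to be stored
as coefficient lists with the highest power first, the field F is assumed to
be the field of rational numbers, and the helper names `poly_divmod` and
`continued_fraction` are illustrative only. It expands a quotient of two
polynomials by repeated division with remainder.

```python
from fractions import Fraction


def poly_divmod(num, den):
    """Divide two polynomials (coefficient lists, highest power first).

    `den` must have a non-zero leading coefficient.
    Returns (quotient, remainder); the zero remainder is the empty list.
    """
    num = [Fraction(c) for c in num]
    den = [Fraction(c) for c in den]
    quot = []
    while len(num) >= len(den):
        c = num[0] / den[0]          # next coefficient of the quotient
        quot.append(c)
        for i in range(len(den)):
            num[i] -= c * den[i]
        num.pop(0)                   # the leading coefficient is now zero
    while num and num[0] == 0:       # normalise the remainder
        num.pop(0)
    return (quot or [Fraction(0)]), num


def continued_fraction(num, den):
    """Elements s_1, s_2, ... of the finite continued fraction of num/den."""
    elements = []
    while den:
        s, remainder = poly_divmod(num, den)
        elements.append(s)
        num, den = den, remainder
    return elements


# (x^3 + 2x + 1) / (x^2 + 1)
cf = continued_fraction([1, 0, 2, 1], [1, 0, 1])
```

For this quotient the elements come out as $x$, $x - 1$, $\frac{1}{2}x + \frac{1}{2}$; every
element after the first has positive degree, as the theorem requires.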

4-53 Approximation by rational functions. The elements of B can
be approximated by finite continued fractions whose elements are
polynomials. The theory of approximation which will be given now has much
analogy with the approximation of real numbers by finite continued
fractions with integral elements explained in 4-2; the same formulas of the



190 


ALGEBRA I 


general theory (see 4-1) will be applied. Whereas the real numbers have
been approximated in such a way that the absolute value of the "error"
was made smaller than any positive number, the elements of B will be
approximated with an error whose degree diverges to $-\infty$. If the field F
of the coefficients consists of numbers, the functions of x which are the
elements of B are numerically approximated by the continued fractions
for absolute values of x which are sufficiently high. Thus the method can
be used for a numerical approximation of the asymptotic behaviour of
those functions.


Let an element $\alpha_1$ of B be expanded into a continued fraction $(s_1, s_2, \ldots)$
satisfying the conditions of the preceding theorem. Denote

$d_i$ = degree $s_i$ for $i = 1, 2, \ldots$ (1)

Hence $d_2, d_3, \ldots$ are positive integers. Apply the notations of 4-1; then
it follows from $\alpha_i = s_i + \frac{1}{\alpha_{i+1}}$ that

degree $\alpha_i = d_i$. (2)

As furthermore $\alpha_{i+1} Q_i + Q_{i-1} = \alpha_{i+1} (\alpha_i Q_{i-1} + Q_{i-2})$, it follows that
$\alpha_{k+1} Q_k + Q_{k-1} = \alpha_2 \alpha_3 \cdots \alpha_{k+1}$, and therefore

degree $(\alpha_2 \cdots \alpha_{k+1}) = d_2 + \cdots + d_k + d_{k+1}$. (3)

From the general formulas

$Q_1 = 1$, $Q_2 = s_2$, $Q_i = s_i Q_{i-1} + Q_{i-2}$,

it follows that for $i = 2, 3, \ldots$, $Q_i$ has a positive degree. By mathematical
induction, it follows that these degrees increase steadily with i, and in
consequence

degree $Q_k$ = degree $(s_k Q_{k-1}) = d_k$ + degree $Q_{k-1} = d_2 + \cdots + d_k$. (4)

Now

$\alpha_1 - \frac{P_k}{Q_k} = \frac{(-1)^{k+1}}{Q_k \, \alpha_2 \cdots \alpha_{k+1}}$.

From this formula together with (3) and (4) it follows that

degree $\left( \alpha_1 - \frac{P_k}{Q_k} \right) = -d_{k+1} - 2$ degree $Q_k$. (5)

The right hand side of (5) can also be expressed by $-$degree $(Q_k Q_{k+1})$ and
by degree $\left( \frac{P_k}{Q_k} - \frac{P_{k+1}}{Q_{k+1}} \right)$. This shows that when $\alpha_1$ is
approximated by its k-th convergent, the degree of the error is the same as
the degree of the difference between the k-th and the (k + 1)-th convergent.
It is easy now to prove a theorem analogous to the theorem on the
approximation of real numbers which has been established in 4-22.


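Theorem. degree $\left( \alpha_1 - \frac{P_k}{Q_k} \right)$ > degree $\left( \alpha_1 - \frac{f(x)}{g(x)} \right)$, where f(x) and g(x)
are polynomials, implies

degree g(x) > degree $Q_k$.

Proof. Put $\beta = \left( \alpha_1 - \frac{P_k}{Q_k} \right) - \left( \alpha_1 - \frac{f(x)}{g(x)} \right) = \frac{f(x)}{g(x)} - \frac{P_k}{Q_k}$.

As $\beta$ is the sum of two elements of B, where the first term has a higher
degree than the second, its degree is the same as the degree of the first term;
therefore it follows from (5) that

degree $\beta = -d_{k+1} - 2$ degree $Q_k$. (6)

On the other hand, $\beta \, g(x) \, Q_k$ is a polynomial in x and is different from
zero. Hence

$0 \le$ degree $(\beta \, g(x) \, Q_k)$ = degree $\beta$ + degree g(x) + degree $Q_k$;

putting in the value of degree $\beta$ given by (6), one gets

degree g(x) $\ge$ degree $Q_k + d_{k+1}$ > degree $Q_k$.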
Exercise.

$\frac{1}{2} \log \frac{x+1}{x-1} = \frac{1}{x} + \frac{1}{3x^3} + \frac{1}{5x^5} + \cdots$

Represent this function by a continued fraction and approximate it
by rational functions.


4-54 Continued fractions whose elements are polynomials. To build
up the theory of the expansion of the elements of B into continued fractions
analogous to the corresponding theory for real numbers, it must be shown
that every infinite continued fraction of the type considered here is indeed
the expansion of an element of B. For this purpose the following lemma
is helpful.

Lemma. Let $\alpha = (s_1, s_2, \ldots)$, $\alpha' = (s'_1, s'_2, \ldots)$ be finite or infinite
continued fractions; let m be the lowest index for which $s_m \ne s'_m$ holds, or
for which $s_m$ but not $s'_m$ exists, and let $(s_1, \ldots, s_m) = A$, $(s'_1, \ldots, s'_m) = A'$;
then

degree $(\alpha - \alpha')$ = degree $(A - A')$ holds. (1)

Proof. Without loss of generality suppose that degree $s_m = r \ge$
degree $s'_m = r'$. The ordinary notations will be used for the convergents
of $\alpha$, and those of $\alpha'$ will be distinguished by a dash.

Then $P_i = P'_i$, $Q_i = Q'_i$ for i < m, and, with q = degree $Q_{m-1}$,

$Q_m = s_m Q_{m-1} + Q_{m-2}$, degree $Q_m = r + q$,
$Q'_m = s'_m Q_{m-1} + Q_{m-2}$, degree $Q'_m = r' + q$.

Now

$A - A' = (-1)^m \left( \frac{1}{Q_{m-1} Q_m} - \frac{1}{Q_{m-1} Q'_m} \right) = (-1)^m \frac{s'_m - s_m}{Q_m Q'_m}$.

Two different cases have to be considered.

1. Let r = r'; as degree $(s'_m - s_m) \ge 0$,

degree $(A - A') \ge -2(r' + q)$ = degree $\frac{1}{Q'^2_m}$.

2. Let r > r'; then degree $(s'_m - s_m) = r$, and therefore

degree $(A - A') = r - (r + q + r' + q) > -2(r' + q)$ = degree $\frac{1}{Q'^2_m}$.

Hence

degree $(A - A') \ge$ degree $\frac{1}{Q'^2_m}$

holds in every case.
From 4-53, (5) it follows that

degree $(A - \alpha)$ = degree $\frac{1}{Q_m Q_{m+1}}$ < degree $\frac{1}{Q_m^2} \le$ degree $\frac{1}{Q'^2_m}$,

and that degree $(A' - \alpha')$ = degree $\frac{1}{Q'_m Q'_{m+1}}$ < degree $\frac{1}{Q'^2_m}$. Hence

degree $(\alpha - \alpha')$ = degree $[(A - A') - (A - \alpha) + (A' - \alpha')]$ =
degree $(A - A')$, as the degree of the first one of the three summands is
greater than the degrees of the two other summands.

Theorem. Let $s_1, s_2, \ldots$ be an infinite set of polynomials of F[x] and
let for i > 1, degree $s_i$ > 0; then there exists a continued fraction
$(s_1, s_2, \ldots)$.

Proof. The set $s_1, s_2, \ldots$ determines uniquely the values $P_n$, $Q_n$,
and $\frac{P_n}{Q_n} = (s_1, \ldots, s_n)$. Let 1 < n < N; from the preceding lemma it
follows that

degree $\left( \frac{P_N}{Q_N} - \frac{P_n}{Q_n} \right)$ = degree $\left( \frac{P_{n+1}}{Q_{n+1}} - \frac{P_n}{Q_n} \right)$ = degree $\frac{1}{Q_n Q_{n+1}} = -k_n$,

where $k_n$ increases to infinity with n. Hence

$\frac{P_N}{Q_N} = \sum_{k=-k_n+1}^{m} b_k x^k + \sum_{k=-\infty}^{-k_n} c_k x^k$.

The coefficients $b_k$ are independent of N. As $k_n$ increases with the index
n, one gets an infinite set $b_m, b_{m-1}, \ldots, b_{-k_n+1}, \ldots$ determining

$\varphi(x) = \sum_{k=-\infty}^{m} b_k x^k$.

Finally one has to prove that $\varphi(x) = (s_1, s_2, \ldots)$. Let $\varphi(x) = (s'_1, s'_2, \ldots)$,
and m be the smallest index for which $s_m \ne s'_m$; then it follows from the
lemma that for every n > m,

degree $(\varphi(x) - (s_1, \ldots, s_n))$ = degree $(\varphi(x) - (s_1, \ldots, s_m))$

holds, as both the sides are equal to degree $((s'_1, \ldots, s'_m) - (s_1, \ldots, s_m))$.
But $\varphi(x) - (s_1, \ldots, s_n)$ contains no power of x higher than $x^{-k_n}$, so that
its degree is at most $-k_n$, which decreases infinitely with n. Thus there is
no index like m; hence the theorem.


4-6 Continued fractions with rational elements

Let S be the field of the rational numbers; then every finite continued
fraction

$(s_1, \ldots, s_n) = s_1 + \frac{1}{Q_1 Q_2} - \frac{1}{Q_2 Q_3} + \cdots + \frac{(-1)^n}{Q_{n-1} Q_n}$ (1)

represents a rational number, but an infinite continued fraction determines
a number by approximation if and only if

$s_1 + \sum_{n=2}^{\infty} \frac{(-1)^n}{Q_{n-1} Q_n}$ (2)

converges. If the sum (2) is convergent,

$(s_1, s_2, \ldots)$ (3)

determines a real number equal to (2). A necessary condition for the
convergence of (3) is therefore

$|Q_{n-1} Q_n| \to \infty$. (4)

If the numbers $Q_n$ are either > 0 each, or < 0 each, or of alternating
sign, the sum (2) is an alternating sum; hence in these cases the continued
fraction converges if $|Q_n Q_{n-1}|$ increases steadily to infinity.


4-61 Convergence of continued fractions. For the continued fractions
considered above, some criteria of convergence will be established now. In
these investigations, every sequence (series, continued fraction) which does
not converge to a real number will be called divergent.

Theorem 1. If $\sum |s_i|$ is convergent, 4-6, (3) is divergent.

Proof. At first, it will be shown by mathematical induction that

$|Q_n| \le \prod_{i=2}^{n} (1 + |s_i|)$. (1)

As $Q_1 = 1$, $Q_2 = s_2$, the formula holds for $n \le 2$. If (1) is true for n < m,
it follows from

$Q_m = s_m Q_{m-1} + Q_{m-2}$

that

$|Q_m| \le \prod_{i=2}^{m-2} (1 + |s_i|) \{ |s_m| (1 + |s_{m-1}|) + 1 \} \le \prod_{i=2}^{m} (1 + |s_i|)$.

If $\sum |s_i|$ converges, the infinite product $\prod (1 + |s_i|)$ converges to a
positive number Q, and $|Q_n| \le Q$ holds for every index n. Hence 4-6, (4)
does not hold and the continued fraction is divergent.

Let $s_j > 0$ for j > 1. From $Q_1 = 1$, $Q_2 = s_2 > 0$, $Q_n = s_n Q_{n-1} + Q_{n-2}$
it follows by mathematical induction that each number $Q_i > 0$, and

$Q_n Q_{n-1} = s_n Q_{n-1}^2 + Q_{n-1} Q_{n-2} > Q_{n-1} Q_{n-2} > 0$.

4-6, (2) is therefore an alternating series, whose elements have steadily
decreasing absolute values. This series converges therefore if and only if
4-6, (4) is satisfied. These considerations lead to the following theorem.

Theorem 2. Let $s_i > 0$ for i > 1; then the continued fraction 4-6, (3)
is convergent if and only if $\sum s_i$ is divergent.

Proof. If $\sum s_i = \sum |s_i|$ is convergent, the continued fraction is
divergent, as has been proved by the preceding theorem.

Let $\sum s_i$ be divergent. As $Q_i > 0$, $Q_1 = 1$, and
$Q_{2n+1} = s_{2n+1} Q_{2n} + Q_{2n-1}$, we get by mathematical induction that $Q_{2n+1} \ge 1$.
Hence $Q_{2n} = s_{2n} Q_{2n-1} + Q_{2n-2} \ge s_{2n} + Q_{2n-2}$, and as $Q_2 = s_2$ it follows
by mathematical induction that

$Q_{2n} \ge \sum_{j=1}^{n} s_{2j}$. (2)

If the sum (2) diverges with steadily increasing n,

$Q_{2n-1} Q_{2n} \to \infty$ and $Q_{2n} Q_{2n+1} \to \infty$, (3)

and therefore 4-6, (4) is satisfied. If however (2) converges, then $\sum s_{2j+1}$
diverges, and therefore $Q_{2n+1} \ge s_2 (s_3 + s_5 + \cdots + s_{2n+1})$ diverges with increasing
n. Thus 4-6, (4) holds in every case, and this condition implies the
convergence of the continued fraction in the case considered here. Hence
the theorem.
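
The two theorems can be observed numerically. The following Python sketch
is an illustration only and is not part of the book; the function name
`convergents` and the two sample sequences are assumptions chosen for the
purpose. It compares a continued fraction with $s_i = 1$ (divergent sum) with
one whose elements $s_i = 2^{-i}$ have a convergent sum.

```python
from fractions import Fraction


def convergents(elements):
    """Return P_n/Q_n for n = 1, 2, ..., using Q_1 = 1, Q_2 = s_2."""
    p_prev, q_prev = Fraction(1), Fraction(0)        # P_0, Q_0
    p, q = Fraction(elements[0]), Fraction(1)        # P_1, Q_1
    out = [p / q]
    for s in elements[1:]:
        p, p_prev = s * p + p_prev, p
        q, q_prev = s * q + q_prev, q
        out.append(p / q)
    return out


# Divergent sum of the elements: consecutive convergents approach each other.
ones = convergents([Fraction(1)] * 30)
gap_divergent_sum = abs(ones[-1] - ones[-2])

# Convergent sum of the elements: the Q_n stay bounded, so the gap
# 1 / (Q_n Q_{n-1}) between consecutive convergents cannot tend to zero.
small = convergents([Fraction(0)] + [Fraction(1, 2 ** i) for i in range(2, 31)])
gap_convergent_sum = abs(small[-1] - small[-2])
```

In the first case the gap between the last two convergents is far below
$10^{-10}$; in the second case it stays above 0.3, in accordance with Theorem 1.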

4-62 Test of irrationality. The preceding investigations can be used
to prove the irrationality of certain numbers.

Let $s_1 = 0$, $s_i \ge 1$ for i > 1. As $\sum s_i \to \infty$ it follows from the preceding
theorem that the continued fraction converges. The limit lies between the
convergents with odd and those with even index, i.e. between $P_1/Q_1 = 0$
and $P_2/Q_2 = 1/s_2 \le 1$. It will be shown now that this value is irrational.

Let $\alpha_1 = (s_1, s_2, \ldots)$ be rational, say $\alpha_1 = \frac{a_2}{a_1}$, where $a_1$, $a_2$ are integral;
then $a_1 > a_2$ and $\frac{a_2}{a_1} = \frac{1}{s_2 + \alpha_2}$, hence $\alpha_2 = \frac{a_1}{a_2} - s_2 = \frac{a_3}{a_2}$,
where $a_3 = a_1 - s_2 a_2$ is integral. As $\alpha_2 = (0, s_3, \ldots)$, there is $0 < \alpha_2 < 1$;
hence $a_2 > a_3 > 0$. In the same manner, one gets

$\alpha_2 = \frac{1}{s_3 + \alpha_3}$, $\alpha_3 = (0, s_4, \ldots) = \frac{a_4}{a_3}$ and $a_3 > a_4 > 0$.

This procedure must end after a finite number of steps, as a sequence
$a_1 > a_2 > \cdots > a_i > \cdots$ of integral positive numbers cannot be continued
indefinitely. Since the continued fraction has been supposed to be infinite,
the assumption of $\alpha_1$ to be rational leads to a contradiction. Hence $\alpha_1$ is
irrational.

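Example. $s_1 = 0$, $s_2 = 2$, $s_3 = 1$, and for m > 1,

$s_{2m} = \frac{2 \cdot 4 \cdots (2m-2)}{1 \cdot 3 \cdots (2m-3)} > 1$, $s_{2m+1} = \frac{3 \cdot 5 \cdots (2m-1)}{2 \cdot 4 \cdots (2m-2)} > 1$;

then

$Q_1 = 1$, $Q_2 = 2$, $Q_3 = 3$,

and from the identities

$2 \cdot 4 \cdots 2m = s_{2m} \cdot 1 \cdot 3 \cdots (2m-1) + 2 \cdot 4 \cdots (2m-2)$,
$1 \cdot 3 \cdots (2m+1) = s_{2m+1} \cdot 2 \cdot 4 \cdots 2m + 1 \cdot 3 \cdots (2m-1)$,

it follows by mathematical induction that

$Q_{2m} = 2 \cdot 4 \cdots 2m$, $Q_{2m+1} = 1 \cdot 3 \cdots (2m+1)$,

hence

$Q_n Q_{n-1} = n!$;

thus the continued fraction is irrational. Its value is

$\frac{1}{Q_1 Q_2} - \frac{1}{Q_2 Q_3} + \cdots + \frac{(-1)^n}{Q_{n-1} Q_n} + \cdots = \sum_{n=2}^{\infty} \frac{(-1)^n}{n!} = e^{-1}$.

Hence e is irrational.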
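
The arithmetic of this example can be checked mechanically. The following
Python sketch is an illustration and not part of the book; the function names
are assumptions. It builds the elements $s_1, \ldots, s_{15}$ from the product
formulas, forms $P_n$ and $Q_n$ by the usual recursion, and compares the
products $Q_n Q_{n-1}$ with $n!$ and the 15th convergent with $e^{-1}$.

```python
from fractions import Fraction
from math import exp, factorial


def example_elements(count):
    """s_1, ..., s_count of the example (count >= 3)."""
    s = [Fraction(0), Fraction(2), Fraction(1)]      # s_1, s_2, s_3
    m = 2
    while len(s) < count:
        even = Fraction(1)                           # becomes s_{2m}
        odd = Fraction(1)                            # becomes s_{2m+1}
        for j in range(1, m):
            even *= Fraction(2 * j, 2 * j - 1)       # 2*4*...*(2m-2) / 1*3*...*(2m-3)
            odd *= Fraction(2 * j + 1, 2 * j)        # 3*5*...*(2m-1) / 2*4*...*(2m-2)
        s += [even, odd]
        m += 1
    return s[:count]


def numerators_denominators(s):
    """Lists p, q with p[n] = P_n and q[n] = Q_n (index 0 holds P_0, Q_0)."""
    p = [Fraction(1), s[0]]
    q = [Fraction(0), Fraction(1)]
    for n in range(2, len(s) + 1):
        p.append(s[n - 1] * p[n - 1] + p[n - 2])     # s[n - 1] is s_n
        q.append(s[n - 1] * q[n - 1] + q[n - 2])
    return p, q


s = example_elements(15)
p, q = numerators_denominators(s)
products_are_factorials = all(q[n] * q[n - 1] == factorial(n) for n in range(2, 16))
value = p[15] / q[15]                                # 15th convergent, exact
```

The 15th convergent equals the partial sum of $\sum (-1)^n / n!$ up to n = 15
exactly, and differs from $e^{-1}$ by less than $1/16!$.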



CHAPTER V 

APPROXIMATION OF ROOTS 
Introduction. Another dialogue

Student. When I started reading Algebra, you advised me to study
carefully the systems of linear equations; so I did. I further read general
algebra and continued fractions. At first it was hard work, but later on,
I was quite successful.

Tutor. All right, but I don't think that you have met me for this.
You are looking rather despondent.

St. Indeed. I am again in the wilderness.

T. Why? Is there some difficulty in the book, which you want me
to explain?

St. It is not for that, but the whole subject became problematical
to me again.

T. I wonder how.

St. Yesterday an engineer asked me to solve a certain algebraic
equation. I replied that in consequence of the fundamental theorem of general
algebra, there exist roots in a suitable extension of the field of the
coefficients, and that by the fundamental theorem of classical algebra, this
extension can be chosen as the field of the complex numbers. Thus there exist
complex roots of the equation, and some of them might be real.

T. Was the engineer satisfied by your reply?

St. Not at all! He said that I seemed to be a great philosopher,
and that I had missed the point completely. He was interested in real
roots only, and he had no doubt about their existence. He had found out
that the force (expressed in kilogrammes), acting on a certain part of an
engine, was bound to satisfy that equation. He was asking me to compute
that force, and nothing else.

T. And you could not; the polynomial was too complicated.





St. It looked very simple. Something like $x^4 + 4x^3 + 2x + 6$.
From Eisenstein's theorem it is irreducible, and therefore its real roots
must be irrational; this I told the engineer.

T. Perhaps, the good man did not know anything about irrationality.

St. He did! But he was not at all interested in my statement. He
said: "I don't want to have an infinity of decimals, even if you can
provide me with them; compute the kilogrammes, I leave the grammes
etc. to you." Now for any positive x, the polynomial takes positive values
only. So I told him that the real roots of the polynomial must be negative.

T. And, was this statement of any use to the engineer?

St. No, he knew already that the force was directed to the negative
side, and then he said: "The direction of the force is not very interesting
to me, as there is little difference whether the material is exposed to stress
or to pressure. If you give me a solution with 30% of error and a wrong
sign, I could make some use of it, but your philosophical talk is worth
nothing." He was quite rude eventually.

T. And you?

St. I am bewildered. After having read about 200 pages of the book,
I am still unable to solve a very simple algebraic problem, not even if 30%
of error and a wrong sign are admitted! Though I got very interested
in algebra, the engineer's argument has impressed me; I am afraid that
all my hard work has been spent uselessly.

T. I rather think you have stopped reading at the wrong place. If
you continue, you will be able to provide your friend with a solution which
has considerably less than 30% of error.

St. I had already a glance at the next chapter, but I do not see any
connection between its content and the preceding parts of the book, e.g.
with general algebra, and then, there is another thing which strikes me.
Every solution is given only approximately. I should like to know the
solutions correctly. If for a particular application a few decimals only are
requested, then I may neglect the higher terms of the correct result, but as
a student of Pure Mathematics, I must know at first the proper solution
before admitting some error for the sake of abbreviation.

T. How do you want to represent the solution if it happens to be an
irrational number?



SECOND DIALOGUE 


199 


St. There are many ways of expressing irrational numbers. For
instance, $\sqrt{2}$ is irrational; it cannot be expressed as a ratio of two integers,
but nevertheless $\sqrt{2}$ is a number. Everybody knows what $\sqrt{2}$ is.

T. Suppose that I do not know it, and try to explain!

St. $\sqrt{2}$ is the positive number, the square of which is equal to 2.

T. Well, I take it for granted that one and only one such positive
number exists. Let x be positive and $x^2 = 2$, then $x = \sqrt{2}$, or $\sqrt{2}$ is
the positive root of $x^2 - 2 = 0$. I think this statement is completely
equivalent to yours.

St. It is.

T. You seem to be satisfied with this manner of expressing irrational
numbers.

St. Of course, I am. If I could express the roots of every algebraic
equation in a similar way, then there would be nothing to complain of.

T. My point is that in this case, the roots are expressed by a tautology.

St. I cannot follow you.

T. Listen, which are the roots of the polynomial $x^2 - 2$?

St. $\sqrt{2}$ and $-\sqrt{2}$.

T. Whereby $\sqrt{2}$ is nothing else than a symbol for the positive root
of $x^2 - 2$. Besides the statement that $x^2 - 2$ has two real roots and
that their sum is equal to zero, your solution of the problem to find the
roots of $x^2 - 2$ is a mere tautology. Your conclusion goes like this:
"Who is Amal?" "The brother of Bimal." "And who is Bimal?"
"Amal's brother." That means only that there exist two brothers Amal
and Bimal, but it does not explain who is Amal.

St. But $\sqrt{2}$ is a well known number; mathematicians have got used
to it, and they calculate with $\sqrt{2}$ as they do with 22 or 17. For me,
there is no problem about $\sqrt{2}$.

T. Is it for the symbol $\sqrt{\ }$, that you hold this opinion?

St. $\sqrt{\ }$ as a symbol is a mere convention; the mathematicians could
use any other notation instead of it, but I do not see any reason why symbols
familiar to everybody should be replaced by new ones.

T. I fully agree with you, but for the sake of our conversation, let us
denote the real roots of a polynomial f(x), as far as they exist, by $[f(x)]_1$,
$[f(x)]_2, \ldots, [f(x)]_m$ in their order of magnitude starting with the greatest
root. Then your explanation of $\sqrt{2}$ means simply that $\sqrt{2}$ is equal to
$[x^2 - 2]_1$.

St. Now you want me to admit that for every f(x) which has a real
root, the symbol $[f(x)]_1$ must be considered as a solution of the equation
f(x) = 0. You propose that there is no higher justification in considering
$\sqrt{2}$ as a given number, than e.g. $[x^4 + 4x^3 + 2x + 6]_1$. But there is a
huge difference between these two cases.

T. How that?

St. We know more about $\sqrt{2}$ than that it is positive and that its square
is equal to 2.

T. What do you know about $\sqrt{2}$?

St. $\sqrt{2} = 1.4142\ldots$

T. 1.4142 is a rational number, whereas $\sqrt{2}$ is irrational.

St. Certainly, it is an infinite decimal fraction, but they have computed
200 decimals or even more of them. You cannot deny that $\sqrt{2}$ is very well
known.

T. There still remains a certain error.

St. But a negligible one!

T. That depends on the purpose of the calculation. I was told that
certain students of Pure Mathematics must know the proper solution before
admitting some error for the sake of abbreviation. Was it not so?

St. But $\sqrt{2}$ is uniquely determined as the only positive root of $x^2 - 2$.

T. Yes, there exists one and only one such root. This is a statement
on existence and on uniqueness, but nothing more than that. I think we
have agreed already about this item. On the other hand, I admit that we
know more than that about $\sqrt{2}$. For instance $\sqrt{2}$ is approximately equal
to 1.4142, or to put it more clearly, $\sqrt{2}$ lies between 1.4142 and 1.4143.
One can find out easily smaller intervals where $\sqrt{2}$ is situated; there is no
limit to the improvement of the approximation, and the diminution of the
error. This "error" is not a kind of "mistake" which is the result of a
negligent treatment; on the contrary, it is an essential part of the solution of the
problem. One cannot determine irrational numbers otherwise than
approximately. This fact is concealed by some symbols we are using. Numbers
represented by them are uniquely determined in the sense that there exists
one and only one such number, but one cannot determine the place of an
irrational number on the real axis otherwise than approximately.

St. Thus a formula like $\sqrt{14 + \sqrt{170}}$ is only a "recipe" how to
determine an irrational number approximately.

T. The formula denotes the greatest root of $x^4 - 28x^2 + 26$, and it
shows of course a recipe how to compute that number approximately. A
mental calculation furnishes 5 as a first approximation which could satisfy
your friend completely.

St. Suppose, one could represent every root of a polynomial by the help
of similar symbols; this would furnish recipes to determine every root
approximately.

T. As a matter of fact, not every root is representable in that manner,
and even if it is, one prefers a different method sometimes.

St. My impression is that those methods have no connection with
general algebra. Theory of approximation and general algebra apparently
belong to different branches of Mathematics, if not to two different Sciences.

T. They are complementary to each other. You already mentioned
the two fundamental theorems which state the existence of roots in certain
fields. They must be supplemented by investigations about where the roots
are situated in the field. The methods of investigation must tally with the
structure of the particular field under consideration; they cannot be of a
general nature. The real numbers are ordered linearly, whereas the
complex numbers correspond to the points of a plane. Hence one subdivides
the real axis into intervals to determine real numbers, and similarly the
plane is subdivided into certain domains (e.g. rectangular or circular ones)
to locate complex numbers. Both ways lead to an approximate
determination of numbers, e.g. roots of a polynomial.

St. As the n roots of the polynomial $x^n + a_{n-1} x^{n-1} + \cdots + a_0$ are
uniquely determined by the numbers $a_0, \ldots, a_{n-1}$, there must exist
functions $f(a_0, \ldots, a_{n-1})$ which show the distribution of the roots in the complex
plane. One should investigate these functions; this would be a worthy
continuation of general algebra.

T. If you take "function" in the most general sense, there exist indeed
such functions $f(a_0, \ldots, a_{n-1})$, e.g. the set of the roots itself is one. The
problem is how to represent those functions; you should not expect that
all of them are polynomials in $a_0, \ldots, a_{n-1}$. Every polynomial
$x^n + a_{n-1} x^{n-1} + \cdots + a_0$ can be represented by a point $P = (a_0, \ldots, a_{n-1})$ of an
n-dimensional space. There are theorems stating that if certain
inequalities in $a_0, \ldots, a_{n-1}$ hold, i.e. if P is situated in a particular domain of the
n-dimensional space, the n roots are distributed in the complex plane
in a particular manner. These investigations are very interesting, but at
the present time, the approximation of the roots is based more on the
methods of calculation than on these theorems. In many cases, the theorems
seem to be the result of the practice of calculation. For this reason, the
author has started from Horner's scheme which gives the clue to the whole
theory. I advise you to work out many numerical examples; it will help
you to understand the theoretical portion.


5-1 Horner's scheme

The theory of approximation of the roots of a polynomial f(x) with
real coefficients is based on the fact that f(x) represents a continuous
function if x is considered as a real variable. As a consequence, the function f(x)
takes all the values between f(a) and f(b) in the interval (a, b). In
particular, if f(a) and f(b) have opposite signs, there must exist a root of f(x)
in this interval. This conclusion plays an important role in the
approximation of the roots, but a theory of approximation based on it alone would
involve an enormous amount of numerical calculations. A second
important item is that

$f(x) = a_0 + a_1 x + \cdots + a_n x^n$ (1)

is the Taylor expansion of the function represented by it in the
neighbourhood of x = 0. Hence

$f(0) = a_0$, $f'(0) = a_1$, $\ldots$, $\frac{1}{n!} f^{(n)}(0) = a_n$. (2)

Thus one knows from (1) at the first sight much more about the behaviour
of the function near x = 0 than in the neighbourhood of any other value.
It is therefore a matter of the greatest importance to have a scheme for a
quick calculation of the Taylor expansion at any point x = q, say

$f(x) = a'_0 + a''_1 (x - q) + \cdots + a^{(n)}_{n-1} (x - q)^{n-1} + a^{(n+1)}_n (x - q)^n$.

The scheme which is used for this purpose is called Horner's scheme; though
it is very elementary, it discloses much about the function represented by
f(x). The coefficients $a'_0, a''_1, \ldots$ are continuous functions of q.
For instance $a'_0 = f(q)$, and $a^{(n+1)}_n = a_n$ is a constant number. It is
characteristic for the applications of Horner's scheme that not $a'_0$ alone, but
the full set of the coefficients and their interconnections are considered.
Although Horner's scheme is mostly applied to real numbers, it can also be
used when the coefficients belong to any field (or even to an arbitrary
commutative ring) K.

5-11 Expansion of f(x) as a polynomial in x - q. Let K be an
arbitrary field, e.g. the field of the real numbers or of the complex numbers,
q be an element of K and f(x) a polynomial of K[x],

$f(x) = \sum_{i=0}^{n} a_i x^i = (x - q) \sum_{i=1}^{n} a'_i x^{i-1} + a'_0 = (x - q) f_1(x) + a'_0$,

where

$a_i = a'_i - q a'_{i+1}$ for $i = 0, 1, \ldots, n - 1$, and $a_n = a'_n$.

Hence

$a'_n = a_n$,
$a'_{n-1} = a_{n-1} + q a'_n$,
. . .
$a'_0 = a_0 + q a'_1$.

Arrange the calculation of the coefficients a' as follows:

    a_n     a_{n-1}     a_{n-2}      . . .    a_1       a_0
            q a'_n      q a'_{n-1}   . . .    q a'_2    q a'_1
    a'_n    a'_{n-1}    a'_{n-2}     . . .    a'_1      a'_0

By the same method, $f_2(x) = \sum_{i=2}^{n} a''_i x^{i-2}$ is found out satisfying

$f_1(x) = (x - q) f_2(x) + a''_1$.

After n - 1 steps f(x) is represented as a polynomial in x - q:

$f(x) = a'_0 + a''_1 (x - q) + \cdots + a^{(n)}_{n-1} (x - q)^{n-1} + a^{(n+1)}_n (x - q)^n$.

The complete Horner's scheme to get the Taylor expansion at x = q looks
like the following example.

Example. $f(x) = x^4 - 15x^3 + 68x^2 - 119x + 67$

    q = 1          1    -15     68   -119     67
                          1    -14     54    -65
                   1    -14     54    -65      2
                          1    -13     41
                   1    -13     41    -24
                          1    -12
                   1    -12     29
                          1
                   1    -11

The lines of this scheme correspond to the consecutive steps. Their
significance is as follows:

$f(x) = (x^3 - 14x^2 + 54x - 65)(x - 1) + 2$
$= (x^2 - 13x + 41)(x - 1)^2 - 24(x - 1) + 2$
$= (x - 12)(x - 1)^3 + 29(x - 1)^2 - 24(x - 1) + 2$
$= (x - 1)^4 - 11(x - 1)^3 + 29(x - 1)^2 - 24(x - 1) + 2$
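
The scheme above translates directly into a short program. The following
Python sketch is an illustration and not part of the book; the names
`horner_row` and `taylor_shift` and the convention of listing coefficients with
the highest power first are assumptions. Each call of `horner_row` produces
one main row of the scheme; repeating it on the shortened row yields the
coefficients of the expansion in powers of x - q.

```python
def horner_row(coeffs, q):
    """One main row of Horner's scheme.

    `coeffs` lists the coefficients with the highest power first.
    Returns (coefficients of the quotient by x - q, remainder).
    """
    carry = 0
    row = []
    for a in coeffs:
        carry = a + q * carry        # a'_i = a_i + q * a'_{i+1}
        row.append(carry)
    return row[:-1], row[-1]


def taylor_shift(coeffs, q):
    """Coefficients of f in powers of (x - q), lowest power first."""
    expansion = []
    while coeffs:
        coeffs, remainder = horner_row(coeffs, q)
        expansion.append(remainder)
    return expansion


# The example of the text: x^4 - 15x^3 + 68x^2 - 119x + 67 at q = 1.
expansion = taylor_shift([1, -15, 68, -119, 67], 1)
```

For the example, `expansion` holds 2, -24, 29, -11, 1, i.e. the last entries of
the main rows of the scheme read from top to bottom.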

5-12 Approximate calculation of roots by Horner's scheme. Horner's
scheme is very useful for calculating the roots. The method will be
explained by the help of the above example. Put x = y + 1; then

$f(x) = g(y) = y^4 - 11y^3 + 29y^2 - 24y + 2$,
$f(1) = g(0) = 2$,
$f'(1) = g'(0) = -24$.

Thus f(x) has decreased from the value 67 at x = 0 to the value 2 at x = 1,
and it is still decreasing as is seen from the value of the derivative. For
this reason, one may expect that there is a root of f(x) near x = 1. In the
neighbourhood of x = 1, the function can be represented approximately
by its two lowest terms. Applying the notation $\approx$ for approximation, one
gets $f(x) \approx -24y + 2$ and therefore $f(x) \approx 0$ for $y \approx 0.1$. Thus it is
helpful to represent g(y) by a polynomial in y - 0.1 = x - 1.1.

    q = 0.1          1       -11        29       -24         2
                            +0.1     -1.09    +2.791   -2.1209
                     1     -10.9     27.91   -21.209   -0.1209
                            +0.1     -1.08    +2.683
                     1     -10.8     26.83   -18.526
                            +0.1     -1.07
                     1     -10.7     25.76
                            +0.1
                     1     -10.6

As f(1.1) = -0.1209 < 0, there is a root between 1 and 1.1. One
approximates therefore f(x) by -18.526 (x - 1.1) - 0.1209, hence
$x - 1.1 \approx -0.007$. Thus, apply again Horner's scheme:

    q = -0.007   1     -10.6        25.76          -18.526          -0.1209
                       -0.007   +0.074249     -0.180839743  +0.130947878201
                 1    -10.607   25.834249    -18.706839743   0.010047878201
                       -0.007   +0.074298     -0.181359829
                 1    -10.614   25.908547    -18.888199572
                       -0.007   +0.074347
                 1    -10.621   25.982894
                       -0.007
                 1    -10.628

Hence the root is approximately equal to 1.093. The next approximation
is q = 0.00053. By continuing in exactly the same manner, the
calculation would become very burdensome. The number of the decimals to be
considered increases at every step. On the other hand, the influence of the
higher terms decreases with q. One can therefore omit those digits which
are not influencing the terms required in the final result. It is convenient
to state at the very beginning of the calculation the error which is admissible.
In the present problem, one obtains an approximation of the root, correct
up to seven figures of decimals, by the following consideration:

$18.888199572 \, q = 0.010047878201 + 25.982894 \, q^2 - 10.628 \, q^3 + q^4$.

As $q \approx 5.3 \cdot 10^{-4}$, the two last terms will influence only the 9th and the
following decimals on the right side; for $5.3 \cdot 10^{-4} < q < 5.4 \cdot 10^{-4}$ the quadratic
term becomes 0.000007.

Hence

q = 0.0005324, x = 1.0935324.

In this manner the approximation of the root can be improved gradually.
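
The successive shifts of this section can be reproduced with exact rational
arithmetic. The following Python sketch is an illustration and not part of the
book; the function name `taylor_shift` and the variable names are assumptions.
It applies the three shifts 1, 0.1 and -0.007 and then solves the last
equation for q, first from the linear terms alone and then with the quadratic
term included, as described above.

```python
from fractions import Fraction


def taylor_shift(coeffs, q):
    """Coefficients of f in powers of (x - q), lowest power first.

    `coeffs` lists the coefficients of f with the highest power first.
    """
    coeffs = list(coeffs)
    shifted = []
    while coeffs:
        carry = 0
        row = []
        for a in coeffs:
            carry = a + q * carry
            row.append(carry)
        shifted.append(row.pop())    # last entry of the row is the remainder
        coeffs = row
    return shifted


f = [1, -15, 68, -119, 67]
x0 = Fraction(0)
current = f
for step in (Fraction(1), Fraction(1, 10), Fraction(-7, 1000)):
    low_first = taylor_shift(current, step)
    current = low_first[::-1]        # back to "highest power first"
    x0 += step                       # x0 ends at 1.093

c0, c1, c2 = low_first[0], low_first[1], low_first[2]
q = -c0 / c1                         # linear approximation
q = -(c0 + c2 * q * q) / c1          # correction by the quadratic term
root = x0 + q
```

The constants `c0`, `c1`, `c2` agree with the last entries 0.010047878201,
-18.888199572 and 25.982894 of the scheme, and `root` agrees with
1.0935324 to the digits given in the text.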

> 

5-13 A modification of Horner's scheme. To get a first approximation
of the roots, it is often important to get a quick and simple review of the
values of the function for different values of x. For this purpose, the
following modification of Horner's scheme is sometimes helpful.

Given $q_1, \ldots, q_m$, m < n, calculate by Horner's scheme

$f(x) = b_1 + (x - q_1) f_1(x)$,          $f(q_1) = b_1$,
$f_1(x) = b_2 + (x - q_2) f_2(x)$,        $f(q_2) = b_1 + (q_2 - q_1) b_2$,
. . .
$f_{m-1}(x) = b_m + (x - q_m) f_m(x)$,

$f(x) = b_1 + b_2 (x - q_1) + b_3 (x - q_1)(x - q_2) + \cdots + b_m (x - q_1) \cdots (x - q_{m-1})$
$+ (x - q_1) \cdots (x - q_m) f_m(x)$.

To explain the method by an example, expand the polynomial considered
in the previous example in this manner:

    q = 1          1    -15     68   -119     67
                          1    -14     54    -65
    q = 2          1    -14     54    -65      2
                          2    -24     60
    q = 3          1    -12     30     -5
                          3    -27
                   1     -9      3

$f(x) = 2 - 5(x - 1) + 3(x - 1)(x - 2) + (x - 1)(x - 2)(x - 3)(x - 9)$,
f(0) = 67, f(1) = 2, f(2) = -3, f(3) = -2, f(9) = 2 + 8(-5 + 21) = 130.

This representation shows that for x < 1, f(x) > 0, since each of the terms
is > 0, and that for x > 9, f(x) > 0, the 1st, the 4th and the sum of the two
other terms being > 0. So all roots are situated in the interval (1, 9). We
have already calculated one root in the interval (1, 2); furthermore there
is at least one root in the interval (3, 9). f(8) = -117. Hence there is a
root in the interval (8, 9). The reader may calculate it by Horner's
scheme as an exercise.
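
The modified scheme differs from the complete one only in that a new value
of q is used in every main row. The following Python sketch is an illustration
and not part of the book; the names `horner_row`, `newton_form` and `evaluate`
are assumptions. It computes the coefficients $b_1, \ldots, b_m$ and the
remaining factor $f_m$, and evaluates f from this representation.

```python
def horner_row(coeffs, q):
    """One main row of Horner's scheme (coefficients highest power first)."""
    carry = 0
    row = []
    for a in coeffs:
        carry = a + q * carry
        row.append(carry)
    return row[:-1], row[-1]


def newton_form(coeffs, nodes):
    """Return (b, rest) with
    f = b[0] + b[1](x-q1) + ... + b[m-1](x-q1)...(x-q_{m-1}) + (x-q1)...(x-qm) rest(x).
    """
    b = []
    for q in nodes:
        coeffs, remainder = horner_row(coeffs, q)
        b.append(remainder)
    return b, coeffs


def evaluate(b, rest, nodes, x):
    """Value of f at x computed from the representation above."""
    total, product = 0, 1
    for coefficient, q in zip(b, nodes):
        total += coefficient * product
        product *= x - q
    tail = 0
    for c in rest:                   # ordinary Horner evaluation of rest(x)
        tail = tail * x + c
    return total + product * tail


nodes = [1, 2, 3]
b, rest = newton_form([1, -15, 68, -119, 67], nodes)
values = [evaluate(b, rest, nodes, x) for x in (0, 1, 2, 3, 8, 9)]
```

For the example, `b` is 2, -5, 3 and `rest` represents x - 9, and the values at
0, 1, 2, 3, 8, 9 are 67, 2, -3, -2, -117, 130 as stated in the text.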


5-14 Lagrange's method. A different method for computing the roots
will now be explained, and will be applied to the above example. If
$f(x) = \sum_{k=0}^{n} a_k x^k = 0$, then $\frac{1}{x}$ satisfies the condition
$\sum_{k=0}^{n} a_{n-k} \left( \frac{1}{x} \right)^k = 0$, and
to every root x in the interval (0, 1) there corresponds a value of $\frac{1}{x} > 1$.
These considerations lead to the following method of approximation due
to Lagrange.

If $\xi$ is a root of g(x), $a < \xi < a + 1$, put $\xi = a + \frac{1}{\eta_1}$ and represent g(x)
by Horner's scheme as a polynomial f(x - a) in x - a. Then $\eta_1 = \frac{1}{\xi - a} > 1$ is
a root of the polynomial $g_1$ which is obtained from f by the substitution
described above. If $b < \eta_1 < b + 1$, put $\eta_1 = b + \frac{1}{\eta_2}$. By
repetition of this procedure a representation of $\xi$ as a continued fraction is
obtained. After n steps one gets the approximation $\xi \approx \frac{P_n}{Q_n}$ with an error

$\left| \xi - \frac{P_n}{Q_n} \right| < \frac{1}{Q_n Q_{n+1}} < \frac{1}{Q_n^2}$.

This method will be illustrated by the example used beforehand. It
is known that $x^4 - 15x^3 + 68x^2 - 119x + 67$ has a root in the interval
(8, 9). Therefore we represent this polynomial by Horner's method as
a polynomial in x - 8.


    q = 8           1     -15      68    -119      67
                            8     -56      96    -184
                    1      -7      12     -23    -117
                            8       8     160
                    1       1      20     137
                            8      72
                    1       9      92
                            8
                    1      17

Hence $117 \eta_1^4 - 137 \eta_1^3 - 92 \eta_1^2 - 17 \eta_1 - 1 = 0$.

By mental arithmetic it is seen that for q = 2 the last coefficient is
positive, but for q = 1 it must be negative, so $\eta_1$ lies in the interval (1, 2), and
one has to arrange for the Horner expansion for q = 1.


    q = 1         117    -137     -92     -17      -1
                          117     -20    -112    -129
                  117     -20    -112    -129    -130
                          117      97     -15
                  117      97     -15    -144
                          117     214
                  117     214     199
                          117
                  117     331

In the same manner as it has been done for $\eta_1$, one proves that $\eta_2$ is
situated in the interval (1, 2).

    q = 1         130     144    -199    -331    -117
                          130     274      75    -256
                  130     274      75    -256    -373
                          130     404     479
                  130     404     479     223
                          130     534
                  130     534    1013
                          130
                  130     664


$\eta_3$ lies in the interval (2, 3).

    q = 2
            373   -223  -1013   -664   -130
                   746   1046     66  -1196
            373    523     33   -598  -1326
                   746   2538   5142
            373   1269   2571   4544
                   746   4030
            373   2015   6601
                   746
            373   2761

Probably, the reader has by now got the experience that q must be
chosen in such a way that the sign of the last coefficient does not change,
but that the procedure adopted for q + 1 would alter this sign; q = 4
will alter the sign of the second coefficient in the second main row of Horner's
scheme, but the third coefficient will not change its sign, 6601 being too big.
Hence the following coefficients will increase in absolute value and therefore
will remain negative. However for q = 5, −6601 is counterbalanced by more than
3000, and therefore −2761 by more than 12000, and so the sign of the
last coefficient would be altered. Hence q = 4.


    q = 4
           1326  -4544  -6601  -2761   -373
                  5304   3040 -14244 -68020
           1326    760  -3561 -17005 -68393
                  5304  24256  82780
           1326   6064  20695  65775
                  5304  45472
           1326  11368  66167
                  5304
           1326  16672


At the next step, one gets q = 1. Hence $\xi = (8;\ 1, 1, 2, 4, 1, \ldots)$.

Hence

    $P_1 = 8$      $Q_1 = 1$
    $P_2 = 9$      $Q_2 = 1$
    $P_3 = 17$     $Q_3 = 2$
    $P_4 = 43$     $Q_4 = 5$
    $P_5 = 189$    $Q_5 = 22$
    $P_6 = 232$    $Q_6 = 27$
                   $Q_7 \geq 49$

The error $\frac{232}{27} - \xi$ is positive and
$\leq \frac{1}{27 \cdot 49} < 0.00076$.


As $\frac{232}{27} = 8.5925\ldots$, the value of $\xi$ is correct up to the second
decimal only; the third decimal may be 2 or 1. From this example it appears
that Lagrange's method is sometimes slower than the method of 5-12. By
practical experience, one learns best to find out the most convenient
combination of methods in any particular case.
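The calculation of this article can be retraced mechanically. The following Python sketch is only an illustration and not part of the method as stated: the names `taylor_shift`, `lagrange_quotients` and `convergents` are hypothetical, and the search for q assumes (as in the example) that each transformed polynomial has exactly one root greater than 1.

```python
from fractions import Fraction


def taylor_shift(coeffs, q):
    """Rewrite a polynomial (coefficients highest power first) in powers
    of (x - q) by repeated Horner rows."""
    c = list(coeffs)
    n = len(c) - 1
    for end in range(n, 0, -1):
        for i in range(1, end + 1):
            c[i] += q * c[i - 1]
    return c


def horner_eval(coeffs, x):
    value = 0
    for a in coeffs:
        value = value * x + a
    return value


def lagrange_quotients(coeffs, a, steps):
    """Partial quotients (a; q1, q2, ...) of the root lying in (a, a + 1).
    Assumption: that root is the only one in the interval."""
    quotients = [a]
    c, q = list(coeffs), a
    for _ in range(steps):
        c = taylor_shift(c, q)   # polynomial in (x - q)
        c = c[::-1]              # substitute x - q = 1/eta, multiply by eta^n
        if c[0] < 0:
            c = [-v for v in c]  # make the first coefficient positive
        q = 1
        while horner_eval(c, q) * horner_eval(c, q + 1) > 0:
            q += 1               # no change of sign between q and q + 1 yet
        quotients.append(q)
    return quotients


def convergents(quotients):
    """Approximations P_n / Q_n of the continued fraction."""
    p_prev, q_prev = 1, 0
    p, q = quotients[0], 1
    result = [Fraction(p, q)]
    for a in quotients[1:]:
        p, p_prev = a * p + p_prev, p
        q, q_prev = a * q + q_prev, q
        result.append(Fraction(p, q))
    return result


f = [1, -15, 68, -119, 67]
print(taylor_shift(f, 8))
quotients = lagrange_quotients(f, 8, 5)
print(quotients, convergents(quotients)[-1])
```

Exact integer arithmetic is used throughout, so the rows of each Horner scheme can be compared digit by digit with the tables above.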

5-15. Kakeya's theorem. In 5-11 to 5-14, Horner's scheme has been
used for calculating the real roots of equations with real coefficients. But
the scheme can be applied (as has been stated at the beginning of this
section) for arbitrary fields. It will be used now to find out a theorem on
complex numbers. Let $b_0,\ b_1 + b_0,\ \ldots,\ b_n + \ldots + b_0$ be the
coefficients of a polynomial. Put $q = \alpha$; then the first line of Horner's
scheme is the following one:

$$\begin{array}{llll}
b_0 & b_1 + b_0 & b_2 + b_1 + b_0 & \ldots \quad b_n + \ldots + b_0 \\
 & \alpha b_0 & \alpha b_1 + (\alpha + \alpha^2) b_0 & \ldots \quad \alpha b_{n-1} + \ldots + (\alpha + \ldots + \alpha^n) b_0 \\
b_0 & b_1 + (1 + \alpha) b_0 & b_2 + (1 + \alpha) b_1 + (1 + \alpha + \alpha^2) b_0 & \ldots \quad \dfrac{b_n (1 - \alpha) + \ldots + b_0 (1 - \alpha^{n+1})}{1 - \alpha}
\end{array}$$

Hence if $\alpha$ is a root, $\sum_{0}^{n} b_k = \sum_{0}^{n} b_k \alpha^{n+1-k}$ holds.
Let $b_k$ be positive numbers and $\alpha$ complex; then

$$\sum_{0}^{n} b_k \leq \sum_{0}^{n} b_k |\alpha|^{n+1-k}. \qquad (1)$$

If $|\alpha| < 1$, then $\sum_{0}^{n} b_k |\alpha|^{n+1-k} < \sum_{0}^{n} b_k$, hence (1)
cannot be satisfied in this case. The absolute value of a root therefore cannot be
smaller than 1. If $|\alpha| = 1$, then $\alpha = e^{i\theta}$, and

$$\sum_{0}^{n} b_k \alpha^{n+1-k} = \sum_{0}^{n} b_k \cos (n + 1 - k)\theta + i \sum_{0}^{n} b_k \sin (n + 1 - k)\theta \neq \sum_{0}^{n} b_k,$$

when the numbers $b_k$ are positive, unless $\theta = 2k\pi$; but $\alpha = 1$ cannot
be a root, since no positive number satisfies an equation when all the
coefficients are positive; hence $|\alpha| > 1$. If therefore in
$a_0 y^n + a_1 y^{n-1} + \ldots + a_n$

$$0 < a_0 < a_1 < \ldots < a_n, \qquad (2)$$

then for every root $\alpha$ of this polynomial $|\alpha| > 1$ holds. Hence if
$y = \frac{1}{z}$, the roots $\beta$ of $\sum_{0}^{n} a_k z^k$ satisfy $|\beta| < 1$.
This theorem is known as Kakeya's theorem.

Kakeya's theorem. The complex roots of $\sum_{0}^{n} a_k x^k$ have all absolute
values < 1, if the coefficients satisfy (2).
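The identity behind the proof, $(1 - \alpha)\,p(\alpha) = \sum b_k - \sum b_k \alpha^{n+1-k}$ for the polynomial $p$ with coefficients $b_0,\ b_1 + b_0,\ \ldots$, can be checked numerically. The Python sketch below is an illustration under assumptions: the function names are hypothetical, floating-point arithmetic is used, and the theorem itself is only tried on one quadratic with coefficients 1 < 2 < 3.

```python
import cmath


def kakeya_identity_gap(b, alpha):
    """|(1 - alpha) * p(alpha) - (sum b_k - sum b_k * alpha^(n+1-k))|,
    where p has the coefficients b_0, b_1 + b_0, ... (highest power first)."""
    n = len(b) - 1
    coeffs, total = [], 0
    for bk in b:
        total += bk
        coeffs.append(total)           # partial sums b_0 + ... + b_k
    value = 0
    for c in coeffs:                   # last entry of Horner's scheme for q = alpha
        value = value * alpha + c
    rhs = sum(b) - sum(bk * alpha ** (n + 1 - k) for k, bk in enumerate(b))
    return abs((1 - alpha) * value - rhs)


def quadratic_roots(a0, a1, a2):
    """Roots of a0 + a1*z + a2*z^2."""
    d = cmath.sqrt(a1 * a1 - 4 * a2 * a0)
    return (-a1 + d) / (2 * a2), (-a1 - d) / (2 * a2)


print(kakeya_identity_gap([1, 2, 3], 0.3 + 0.4j))
print([abs(z) for z in quadratic_roots(1, 2, 3)])
```

For $1 + 2z + 3z^2$ the two roots are conjugate, so their common absolute value is $\sqrt{1/3}$, which is smaller than 1 as the theorem requires.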

5-20 The roots of real polynomials.

In this section

$$a,\ b,\ c,\ d,\ e,\ \ldots \qquad (1)$$

(with or without indices and dashes) denote real numbers; in the same
manner

$$\alpha,\ \beta,\ \gamma,\ \delta,\ \ldots \qquad (2)$$

denote complex numbers, and $\bar\alpha$ denotes the conjugate of $\alpha$.

Hence $\alpha + \bar\alpha$ is real; $\alpha\bar\alpha$ is non-negative;
$\alpha - \bar\alpha = ci$.

$$f(x) = a_0 + a_1 x + \ldots + a_{n-1} x^{n-1} + x^n \qquad (3)$$

can be represented by

$$f(x) = \prod_{k=1}^{n} (x - \alpha_k). \qquad (4)$$


5-21 Real and complex roots.

Theorem. If $\alpha$ is a root of a polynomial f(x) with real coefficients,
$\bar\alpha$ is also a root of it.

1st Proof. Let K be the field of real numbers, i and −i be the roots of
$\xi^2 + 1$; then K(i) = K(−i) is the field of the complex numbers and there is
an automorphism J of this field interchanging i with −i and leaving the
real numbers unaltered. f(x) will not be altered by J, hence $\alpha$ will be
transformed into a root of f(x); but as $\alpha$ will be transformed into
$\bar\alpha$, the theorem is true.

2nd Proof. If $\alpha$ is real, $\bar\alpha = \alpha$. If $\alpha$ is not real,
$(x - \alpha)(x - \bar\alpha) = g(x)$ is a real polynomial and irreducible in the
field of the real numbers. As f(x) and g(x) have a common root, these polynomials
have a common factor of positive degree. Hence f(x) is divisible by g(x) and
$\bar\alpha$ is therefore a root of f(x).


Corollary 1.

$$f(x) = (x - c_1) \ldots (x - c_r)\,(x - \alpha_1)(x - \bar\alpha_1) \ldots (x - \alpha_k)(x - \bar\alpha_k), \qquad (1)$$

where n = r + 2k.

Corollary 2. If every root of f(x) is counted as many times as its
order of multiplicity in (1), the number of the real roots is ≡ n (mod 2).

Corollary 3. If n is odd, there exists at least one real root.

$$\Delta = \prod_{i<j} (\alpha_i - \alpha_j)^2 = F(\alpha_1, \ldots, \alpha_n) \qquad (2)$$

is called the discriminant of f(x). As F is a symmetric polynomial in
$\alpha_1, \ldots, \alpha_n$ with integral coefficients, it follows (see p. 144) that

$$\Delta = g(a_0, \ldots, a_{n-1}), \qquad (3)$$

where g is a polynomial with integral coefficients. From (2) it follows that

$$\Delta = 0 \text{ if and only if } \alpha_i = \alpha_j \text{ for some } i \neq j. \qquad (4)$$

Let the n roots $\alpha_i$ be all different. As has been proved in 3-35, p. 146,

$$\Delta = \delta^2, \qquad \delta = \begin{vmatrix} \alpha_1^{n-1} & \ldots & \alpha_1 & 1 \\ \vdots & & \vdots & \vdots \\ \alpha_n^{n-1} & \ldots & \alpha_n & 1 \end{vmatrix} \qquad (5)$$

To get the conjugate of $\delta$, we have to interchange every number in (5) with
its conjugate. From (1) it follows that this operation means k interchanges
of rows in the determinant (5). Now the product of two conjugate
numbers is their "norm" N, which is a non-negative real number (see 3-32,
p. 131). Thus

$$(-1)^k \Delta = (-1)^k \delta^2 = N(\delta) > 0.$$

Hence the following theorem holds.

Theorem. Let f(x) of degree n have n different roots. Then the
discriminant of f(x) is positive (negative) when the number of pairs of
conjugate non-real roots is even (odd).

Corollaries. A real polynomial of degree 3 has three real roots if and
only if the discriminant is positive.

A real polynomial of degree 4 with positive discriminant has either four
different real roots or two pairs of conjugate complex roots.
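For degree 3 the corollary can be tried out directly. The sketch below is an illustration, not taken from the text: it assumes the well-known closed expression of the discriminant of a monic cubic $x^3 + ax^2 + bx + c$ in terms of its coefficients, and the names `cubic_discriminant` and `describe_roots` are hypothetical.

```python
def cubic_discriminant(a, b, c):
    """Discriminant of the monic cubic x^3 + a*x^2 + b*x + c,
    i.e. the product of the squared differences of its roots."""
    return a * a * b * b - 4 * b ** 3 - 4 * a ** 3 * c - 27 * c * c + 18 * a * b * c


def describe_roots(a, b, c):
    delta = cubic_discriminant(a, b, c)
    if delta > 0:
        return "three different real roots"
    if delta < 0:
        return "one real root and one pair of conjugate non-real roots"
    return "a multiple root"


# (x - 1)(x - 2)(x - 3) = x^3 - 6x^2 + 11x - 6: the squared differences give 1 * 4 * 1 = 4
print(cubic_discriminant(-6, 11, -6), describe_roots(-6, 11, -6))
# x^3 + x + 1 has only one real root
print(cubic_discriminant(0, 1, 1), describe_roots(0, 1, 1))
```

The first example agrees with definition (2), since the roots 1, 2, 3 have differences 1, 2, 1.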

Exercise Prove the preceding theorem without the help of (4) 

5-22 Changes of sign. At the beginning of 5-1, it has been mentioned
already that if a < b, and the signs of f(a) and f(b) are different, then there
is a root of f(x) in the interval (a, b). This statement, which is fundamental
for the calculation of the real roots, must be complemented by two other
statements (well known from the elements of Analysis) which show the
interconnection of the roots of a real function with those of its derivative.

1. If a < b, and f(a) = f(b), then there exists a root of f′(x) in the
interval (a, b).

2. If $f(x) = (x - \alpha)^k g(x)$, $g(\alpha) \neq 0$, k > 0, then
$f'(x) = (x - \alpha)^{k-1} g_1(x)$, where $g_1(\alpha) \neq 0$, since
$g_1(x) = k\,g(x) + (x - \alpha)\,g'(x)$.

Therefore, if the roots of f(x) take m different real values $a_1, \ldots, a_m$,
there exists at least one root of f′(x) in each of the m − 1 different intervals
$(a_i, a_{i+1})$. If $a_k$ is a multiple root with the multiplicity q + 1, it can be
considered as a set of q degenerate intervals, each of them containing
exactly one root of f′(x). f(x) has the same sign at every point of an interval;
in two consecutive intervals $(a_{i-1}, a_i)$ and $(a_i, a_{i+1})$ the sign of f(x) is
different, when $a_i$ is a simple root or a multiple root of an odd order; the
sign is not different if $a_i$ is a multiple root of even order. Thus if $a_k$ is a
multiple root of order q + 1 of f(x), it is a multiple root of order q of f′(x).
These properties hold for every analytic function with a finite number
of roots, and are not special properties of polynomials. If the coefficients of
f(x) are all positive (all negative), f(x) is obviously positive (negative) for
every positive value of x. Hence f(x) has no positive roots when there is
no change of the sign in the sequence of the coefficients of f(x). So we
are led to study the connection between the existence of roots and the
signs of the coefficients. The experience got by using Horner's scheme
will be very helpful.

Expand f(x) as a polynomial in x − b:

$$f(x) = f_b(x - b) = a_{b,n}(x - b)^n + \ldots + a_{b,0}. \qquad (1)$$

Given f(x) and any real number b, then a set of n + 1 real numbers
$a_{b,n}, \ldots, a_{b,0}$ is uniquely determined; of these, $a_{b,n} = a_n$ is
independent of b and

$$a_{b,0} = f(b).$$

Hence $a_{b,0} = 0$ if and only if b is a real root of f(x), and for 0 ≤ k ≤ n

$$a_{b,k} = \frac{1}{k!} f^{(k)}(b). \qquad (2)$$

Consider now the changes of sign in the sequence

$$a_{b,n},\ a_{b,n-1},\ \ldots,\ a_{b,1},\ a_{b,0}. \qquad (3)$$

To determine uniquely the number of these changes, it is necessary to give
some sign to those coefficients which are equal to zero. Without loss of
generality, suppose that $a_n \neq 0$; if any coefficient, or any set of
consecutive coefficients which are equal to zero, follow in (3) immediately after a
positive (negative) coefficient, they will be considered to be positive
(negative) themselves. By this rule, to every element of (3) a uniquely determined
sign is allotted. E.g. when $f(x) = x^8 - 3x^5 + 7x^3 + 6x^2 - 1$ and b = 0,
the sequence (3) is

    1   0   0  -3   0   7   6   0  -1

and the signs are therefore

    +   +   +   -   -   +   +   +   -

The number of changes of sign in (3) is not altered when from any set of
consecutive equal signs, the first only is considered, and the other ones are
struck out; hence one gets the same number of changes if the
zero-coefficients are simply struck out, but it is convenient to allot a uniquely
determined sign to every element of the sequence (3).

Given f(x), the number of changes will be considered now as a function
C(b) of b. Then

$$0 \leq C(b) \leq n. \qquad (4)$$

If the first and the last element of the sequence (3) have the same sign,
C(b) is an even number; if they have different signs, C(b) is odd. Hence
C(b) is even when $a_n f(b) > 0$, and it is odd when $a_n f(b) < 0$.
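The rule for allotting signs to vanishing coefficients is easily mechanized. The following Python sketch is an illustration only; the names `allotted_signs` and `count_changes` are hypothetical, and the sequence is assumed to begin with a non-zero element, as supposed in the text.

```python
def allotted_signs(sequence):
    """Signs of the elements; a zero takes the sign of the element before it."""
    if sequence[0] == 0:
        raise ValueError("the first element must be different from zero")
    signs = []
    for value in sequence:
        if value > 0:
            signs.append("+")
        elif value < 0:
            signs.append("-")
        else:
            signs.append(signs[-1])   # zero inherits the preceding sign
    return "".join(signs)


def count_changes(sequence):
    """Number of changes of sign in the sequence."""
    signs = allotted_signs(sequence)
    return sum(1 for s, t in zip(signs, signs[1:]) if s != t)


# coefficients of x^8 - 3x^5 + 7x^3 + 6x^2 - 1, highest power first
example = [1, 0, 0, -3, 0, 7, 6, 0, -1]
print(allotted_signs(example), count_changes(example))
```

Striking out the zeros instead of allotting signs to them gives the same count, in accordance with the remark above.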


5-221 Alterations of the first and the second kind. To investigate
C(b) as a function of b, it is not necessary to consider the sequence of the
coefficients $a_{b,m}$ (for m = n, …, 1, 0) themselves; it suffices to examine
the sequence

    ±   +   −   …   ±                                        (1)

of the signs of these n + 1 coefficients. A few general remarks on the
alterations of any sequence (1) generated by transforming signs + into −,
or conversely, will be helpful; one can restrict the consideration to such
alterations where the first sign remains unchanged, since the first coefficient of
the polynomial is constant for all the values of b.

Every alteration of the sequence (1) which does not change the first
sign can be generated by the help of at most n alterations done one after the
other, and each changing one sign only. An alteration which changes
the second, third, …, (n + 1)-th sign either makes that sign equal to the
preceding one (then it will be called an alteration of the first kind), or it makes
it different from the preceding sign (then it is of the second kind).

Lemma. An alteration of the first kind either diminishes the number
of changes in (1) by one or two, or it leaves them unaltered; the
diminution by one takes place if and only if the last sign of (1) is altered.

Proof. If the last sign of (1) is altered by an alteration of the first
kind, then the last two signs were different before the alteration, and are
made equal by it; hence one change is lost, whereas the other changes are
preserved, and no new change is created. If the sign to be altered is not
the last one, then it is the middle of a triplet of signs in which the two first
signs are different. One has therefore to consider the following four cases
(the arrows denoting the alteration):

    + − −  →  + + −          − + +  →  − − +

    + − +  →  + + +          − + −  →  − − −

In the two first cases, the number of changes is unaltered; in the third and
the fourth case, two changes are lost. Hence the lemma.
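The lemma can also be confirmed by brute force for short sequences. The Python sketch below is merely a check under stated assumptions: signs are encoded as +1 and −1, only sequences up to a fixed length are tried, and the names `changes` and `lemma_holds` are hypothetical.

```python
from itertools import product


def changes(signs):
    """Number of changes of sign in a sequence of +1 / -1."""
    return sum(1 for s, t in zip(signs, signs[1:]) if s != t)


def lemma_holds(length):
    """Try every sequence of the given length and every alteration of the
    first kind, i.e. every flip that makes a sign equal to the preceding one."""
    for signs in product((1, -1), repeat=length):
        for i in range(1, length):
            if signs[i] == signs[i - 1]:
                continue                      # flipping here would be of the second kind
            altered = list(signs)
            altered[i] = -altered[i]
            lost = changes(signs) - changes(altered)
            if i == length - 1:
                if lost != 1:                 # last sign altered: exactly one change lost
                    return False
            elif lost not in (0, 2):          # otherwise none or two changes lost
                return False
    return True


print(all(lemma_holds(n) for n in range(2, 9)))
```

Such an enumeration proves nothing for general n, but it exhibits the same four triplet cases that the proof distinguishes.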

Corollary 1. If an alteration A is composed of alterations of the
first kind only, the number of changes either decreases or it remains
unaltered; if in particular the last sign is unaltered by A, the number of the
lost changes is an even non-negative number.

Proof. As by no alteration of the first kind a change can be "gained",
no change can be gained by A. Suppose now that the last sign is unaltered
by A. Of the alterations of the first kind composing A, let there be s
alterations losing one change, and t alterations losing two changes. As
those s alterations are the only ones where the last sign is altered, s is even,
say s = 2k; the number of changes lost by A is equal to 2k + 2t.

Corollary 2. If A is composed of alterations of the first kind, and the
r first signs of (1) are equal, then they remain equal when A is performed.
Conversely, if A′ is composed of alterations of the second kind and the r
first signs are alternately + and −, then they remain so when A′ is
performed.

Proof. The statements hold obviously for alterations of the first (the
second) kind, and therefore for alterations A (alterations A′).

Exercises. Show that an alteration composed of alterations of the
first kind only can also be generated by composing alterations of the first
and of the second kind.

Prove that an alteration composed of one or more alterations of the
first kind cannot be generated by composing alterations of the second
kind only (and conversely).

5-222 Monotony of C(b). The results of 5-221 will now be applied
to investigate C(b) when b runs over the real axis. Consider two different
values of b, say $b_1$ and $b_2 = b_1 + q$, where q > 0. Let $F(\xi)$ be any
polynomial of degree n in the real variable $\xi$ with real coefficients. Put

$$\xi - b_1 = x, \quad F(\xi) = f(x) = a_n x^n + a_{n-1} x^{n-1} + \ldots + a_0 = \phi(x - q)$$
$$= a_n (x - q)^n + a^{(n)}_{n-1} (x - q)^{n-1} + \ldots + a''_1 (x - q) + a'_0. \qquad (2)$$

The coefficients of $\phi(x - q)$ are obtained by Horner's scheme, the different
steps of the scheme being the following ones:

$$\begin{array}{llllll}
a_n, & a_{n-1}, & a_{n-2}, & \ldots, & a_1, & a_0 \\
a_n, & a'_{n-1}, & a'_{n-2}, & \ldots, & a'_1, & a'_0 \\
a_n, & a''_{n-1}, & a''_{n-2}, & \ldots, & a''_1, & a'_0 \\
a_n, & a'''_{n-1}, & a'''_{n-2}, & \ldots, & a''_1, & a'_0 \\
\ldots & & & & &
\end{array} \qquad (3)$$

The alteration leading from the first row of (3) to the second can be
performed by n consecutive steps, replacing successively

$$\begin{array}{ll}
a_{n-1} & \text{by } a'_{n-1} = a_{n-1} + q\,a_n, \\
a_{n-2} & \text{by } a'_{n-2} = a_{n-2} + q\,a'_{n-1}, \\
\ldots & \\
a_0 & \text{by } a'_0 = a_0 + q\,a'_1.
\end{array} \qquad (4)$$

Since q > 0, the sign of $a'_{n-1}$ can be different from the sign of $a_{n-1}$ only
when it is equal to the sign of $a_n$. Hence the first step of (4) either does not
alter the sequence of the signs, or it generates an alteration of the first kind.
Similarly for the other steps; thus if the sequence of the signs in the second
row of (3) is different from the sequence of the signs in the first row, the
alteration is composed of alterations of the first kind. The same
statement holds for the alterations leading from any row of (3) to the following
one. Hence the alteration leading from the coefficients of f(x) to those of
$\phi(x - q)$ is composed of alterations of the first kind; conversely the
alteration leading from $\phi$ to f is composed of alterations of the second kind.
Indeed if one interchanges f and $\phi$ and replaces q by −q, then the first
row of (3) is transformed into the second one by equations

$$\begin{array}{l}
a'_{n-1} = a_{n-1} - q\,a_n, \\
\ldots \\
a'_0 = a_0 - q\,a'_1.
\end{array} \qquad (4')$$


Hence in this case, if the sign of $a'_{n-1}$ differs from that of $a_{n-1}$, it must
be opposite to the sign of $a_n$. From the corollaries of 5-221 it follows
therefore:

1. The function C(b) decreases when b increases.

2. When at any step in a Horner's scheme for a positive q, the first r
coefficients are of the same sign, then the same holds for the final result of the
scheme, and for expansions corresponding to all the higher values of b.

3. When at any step in a Horner's scheme for a negative q, the first r
coefficients have alternating signs, then the same holds for the final result of
the scheme and for the expansions corresponding to all the lower values of b.

Suppose now that $a_n, \ldots, a_{n-r+1}$ have the same sign, say positive (without
loss of generality) and q > 0; then it follows from (4) that

$$a'_{n-1} > a_{n-1} > 0, \quad \ldots, \quad a'_{n-r+1} > a_{n-r+1} > 0,$$
$$a'_{n-r} = a_{n-r} + q\,a'_{n-r+1} \geq a_{n-r} + q\,a_{n-r+1} > 0$$

for a suitable q; therefore $a'_{n-r} > 0$. Hence one can find a value of b for
which the r + 1 first signs are equal, and by repetition of the procedure
one reaches a value of b such that all the signs are equal for this value of b
and for all the higher values. For all these values of b, C(b) = 0. Suppose
now that $a_n, \ldots, a_{n-r+1}$ have alternating signs and that −q < 0; then it
follows from (4′) that

$$|a'_{n-1}| > |a_{n-1}|, \quad \ldots, \quad |a'_{n-r+1}| > |a_{n-r+1}|$$

and the signs are alternating. Let $a_{n-r+1} > 0$; then

$$a'_{n-r} = a_{n-r} - q\,a'_{n-r+1} \leq a_{n-r} - q\,a_{n-r+1}$$

is negative for a suitably chosen value of q. Similarly if $a_{n-r+1} < 0$, then
$a'_{n-r}$ can be made positive. Thus one can find a negative value −q such
that the first r + 1 coefficients are alternating, and corresponding to the
case of equal signs and positive q, one finds that there exists a value of b
such that the signs are alternating for that value of b and all the smaller
values. For all these values, C(b) = n. Hence

Theorem. C(b) decreases steadily from n to 0 when b runs over the
real axis.

5-223 Budan-Fourier's theorem. The function C(b) has been proved
by the last theorem to be a steadily decreasing function of the real variable b.
As it takes integral values only, it is constant by segments; it changes its
value at points of discontinuity only. At these points it decreases, the
saltus being an integral number. An alteration of C(b) occurs where a
coefficient of the polynomial changes its sign, and as these coefficients are
continuous functions of b, a coefficient must take the value zero where it changes
its sign. If an odd loss of changes occurs, the last sign is altered (see 5-221,
lemma), i.e. the variable b passes through a root of the polynomial. The
saltus of C(b) at a root of $F(\xi)$ will now be investigated.

Let b be a root of $F(\xi)$, and let m be its multiplicity; m may be any
positive integral number. Again, put $\xi - b = x$, and $F(\xi) = f(x)$. Then

$$f(0) = f'(0) = \ldots = f^{(m-1)}(0) = 0.$$

The coefficients of f(x) are therefore

$$a_n, \ldots, a_m, 0, \ldots, 0, \quad \text{where } a_m \neq 0.$$

Apply Horner's scheme for a negative value −q; the second row will be

$$a_n, \ldots, a'_m,\ (-q)\,a'_m,\ \ldots,\ (-q)^m a'_m;$$

thus the second row has at least m changes more than the first row. $a'_m$ is
a continuous function of q; if therefore |q| is small enough, $a'_m$ has the
same sign as $a_m$, and $(-q)^m a'_m = a'_0 = f(-q)$ has the same sign as $a_m$
or a different sign, according as m is even or odd. The coefficients of the
expansion of f in powers of x + q show therefore m + 2k more changes of sign
than those of f(x), where k is non-negative. By passing through an m-fold root
of $F(\xi)$, the function C(b) decreases by m or an even positive number more
than m. In any interval (b, c) there exists a finite number (may be zero) of
points of discontinuity of C; in the non-roots among them it changes by an even
number, the alteration in any root is congruent (mod 2) to the multiplicity
of the root; hence the following theorem holds.

Theorem 1. Let f(b) ≠ 0, f(c) ≠ 0, b < c, and let r be the number of
the roots of f(x) in the interval (b, c), every root being counted with its own
multiplicity; then

$$C(b) = C(c) + r + 2k,$$

where k ≥ 0 is an integral number.

Applying this theorem to an interval (0, c), where c is chosen so great
that C(c) = 0, one gets as a corollary

Descartes' rule. The number of the positive roots of f(x) (every root
being counted with its own multiplicity) is equal to the number of the
changes of signs of the coefficients of f(x) or to a number less than it by
an even number.

The number of changes is not altered if one multiplies the coefficients
of f(x) with positive factors, e.g. if one replaces the coefficient of $x^k$ by
$f^{(k)}(b)$. Furthermore, one may write the sequence in the reverse order;
in this case, the first element of the sequence may become zero, and the last
element is constant; hence the elements equal to zero must be provided
now with the sign of the next non-zero element on the right side. Using
these notations, theorem 1 can be expressed as follows.

Budan-Fourier's theorem. Let f(b) ≠ 0, f(c) ≠ 0; then the number of
the roots of f(x) in the interval (b, c) (every root being counted with its own
multiplicity) is equal to the difference of the numbers of changes of signs
in the sets

$$f(b),\ f'(b),\ \ldots,\ f^{(n)}(b)$$
and
$$f(c),\ f'(c),\ \ldots,\ f^{(n)}(c),$$

or to a number less than it by an even number.

Put $x = \frac{cy + b}{y + 1}$ and therefore $y = \frac{x - b}{c - x}$; then the
positive roots of $g(y) = (y + 1)^n f(x)$ are in (1, 1) correspondence to the roots
of f(x) in the interval (b, c). Therefore the number of the roots in this interval
can be found out by Descartes' rule.
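As an illustration of this substitution, the following Python sketch forms the coefficients of $g(y) = (y + 1)^n f\!\left(\frac{cy + b}{y + 1}\right)$ and counts their changes of sign. It is only a sketch under assumptions: coefficient lists are written lowest power first, b and c are integers, and the function names are hypothetical.

```python
def poly_mul(p, q):
    """Product of two polynomials (coefficient lists, lowest power first)."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out


def poly_pow(p, exponent):
    out = [1]
    for _ in range(exponent):
        out = poly_mul(out, p)
    return out


def transformed(coeffs, b, c):
    """Coefficients of g(y) = (y + 1)^n * f((c*y + b) / (y + 1))."""
    n = len(coeffs) - 1
    g = [0] * (n + 1)
    for k, a in enumerate(coeffs):   # a is the coefficient of x^k
        term = poly_mul(poly_pow([b, c], k), poly_pow([1, 1], n - k))
        for i, t in enumerate(term):
            g[i] += a * t
    return g


def sign_changes(sequence):
    """Changes of sign, the zero elements being struck out."""
    signs = [v > 0 for v in sequence if v != 0]
    return sum(1 for s, t in zip(signs, signs[1:]) if s != t)


# f(x) = x^4 - 15x^3 + 68x^2 - 119x + 67, lowest power first
f = [67, -119, 68, -15, 1]
g = transformed(f, 8, 9)
print(g, sign_changes(g))
```

For the interval (8, 9) the transformed polynomial is $130y^4 + 144y^3 - 199y^2 - 331y - 117$, with a single change of sign, so that Descartes' rule gives exactly one root of f(x) between 8 and 9.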

These formulas do not always give directly the exact number of the
roots in an interval, but they are very useful for getting it even in more
complicated cases.

5-2231 An example. Consider again the example of 5-11:

$$f(x) = x^4 - 15x^3 + 68x^2 - 119x + 67.$$

As stated before, the real roots are positive and situated in the interval
(1, 9). One could get this result also by considering the changes of signs:
f(−x) has no change and therefore no positive root, i.e. f(x) has no negative
root. From the previous calculations for this example one gets by
considering the signs only

    C(0) = C(1) = 4
    C(2) = 3
    C(8) = 1.

Two roots have been computed already, one in the interval (1, 2), another
in the interval (8, 9); to each of these roots, there corresponds a loss of
one change. We want to find out whether the loss of the two changes in
(2, 8) corresponds to roots of f(x). For this purpose, we try to approximate
these suspected roots by Horner's scheme and get by very simple calculations

    C(3) = 1        C(2.6) = 3
    C(2.65) = 1     C(2.64) = 3

Hence the two roots can only be situated in the interval 2.64 < x < 2.65,
but we shall prove that f(x) is negative in this interval. As stated
previously,

$$f(x) = 2 - 5(x - 1) + 3(x - 1)(x - 2) + (x - 1)(x - 2)(x - 3)(x - 9)$$
$$= 2 - (x - 1)\{5 - (x - 2)[3 + (3 - x)(9 - x)]\}.$$

Hence for 2.64 < x < 2.65,

$$f(x) < 2 - 1.64\,\{5 - 0.65\,[3 + 0.36 \cdot 6.36]\} < -0.5612.$$

Hence f(x) has only the two roots calculated in 5-12 and 5-14. The
same result can also be obtained by calculating the discriminant and
verifying that it is negative.
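The values of C(b) quoted in this example can be recomputed by expanding f(x) in powers of x − b with Horner's scheme and counting the changes of sign. The Python sketch below is an illustration under assumptions: the names `taylor_coefficients` and `changes_at` are hypothetical, and exact rational arithmetic (`fractions.Fraction`) replaces the decimal calculation of the text.

```python
from fractions import Fraction


def taylor_coefficients(coeffs, b):
    """Coefficients a_{b,n}, ..., a_{b,0} of the expansion in powers of (x - b);
    coeffs are given highest power first."""
    c = [Fraction(v) for v in coeffs]
    n = len(c) - 1
    for end in range(n, 0, -1):          # one row of Horner's scheme per pass
        for i in range(1, end + 1):
            c[i] += b * c[i - 1]
    return c


def changes_at(coeffs, b):
    """C(b): changes of sign, a zero taking the sign of the preceding element."""
    count, previous = 0, None
    for value in taylor_coefficients(coeffs, Fraction(b)):
        sign = previous if value == 0 else (value > 0)
        if previous is not None and sign != previous:
            count += 1
        previous = sign
    return count


f = [1, -15, 68, -119, 67]
for b in (0, 1, 2, "2.6", "2.64", "2.65", 3, 8, 9):
    print(b, changes_at(f, b))
```

Decimal values of b are passed as strings so that they are converted exactly; the loss of two changes between 2.64 and 2.65 appears without any root being present, as shown above.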

5-23 Sturm's theorem. By Budan-Fourier's theorem, the number
of the roots in a given interval (b, c) is not determined uniquely, since the
difference of the number of changes in the sequences

$$f(b),\ f'(b),\ \ldots,\ f^{(n)}(b) \qquad (1)$$
and
$$f(c),\ f'(c),\ \ldots,\ f^{(n)}(c) \qquad (1')$$

may be greater than the number of the roots by an even number. Thus
there arises the task of modifying the method used before by
constructing a sequence of continuous functions

$$f_1(x),\ \ldots,\ f_m(x) \qquad (2)$$

with the property that the number of changes C′(b) in

$$f_1(b),\ \ldots,\ f_m(b) \qquad (2')$$

gives the exact number of the roots which are greater than b, or a number
differing from it by a constant number only. Then

$$C'(b) - C'(c) \qquad (3)$$

is equal to the number of the roots in the interval including its right
(but not its left) endpoint. The solution of the problem is due to
J. K. F. Sturm; it will be given in this article. Though Sturm's method is
applicable to every case, it is not very convenient for practical calculation.

An m-fold root of f(x) is an (m − 1)-fold root of f′(x) and therefore of the
highest common factor h(x) = (f(x), f′(x)). Hence f(x)/h(x) has the same
roots as f(x), each root being a simple one. As f(x)/h(x) can be calculated
by rational operations, one may replace f(x) by f(x)/h(x); thus there is no
loss of generality in supposing that f(x) has simple roots only. This
supposition will be made now. The sequence (2) has to be arranged now in such
a way that C′(x) decreases by one in every root of f(x), but is constant
elsewhere.

Let 1 < k < m; if $f_k(x)$ changes its sign at $x = x_0$, and the sign of
$f_{k-1}(x_0)$ is different from that of $f_{k+1}(x_0)$, then no change is gained or
lost by $f_k(x)$ changing its sign. If however $f_{k-1}(x_0)$, $f_{k+1}(x_0)$ have the
same sign, then two changes are either gained or lost. Now, the sequence (2) is
proposed to be constructed in such a way that the number of the changes alters
by one only, namely in the roots of f(x), and decreases for increasing x. For
this purpose a sequence will be constructed, where the number of changes is
altered only when one of the outer elements changes its sign. Say $f_1(x)$ has
the same sign as f(x) and $f_m(x)$ has a constant sign. These considerations lead
to the following theorem.

Sturm's theorem. Let (2) be a sequence of polynomials satisfying
the following conditions:

1. $f_1(x)$ has the same sign as $f(x)$,

2. $f_2(x)$ has the same sign as $f'(x)$,

3. $f_m(x)$ has a constant sign,

4. if 1 < k < m and $f_k(x_0) = 0$, then $f_{k-1}(x_0)\, f_{k+1}(x_0) < 0$;

let C′(x) be the number of the changes of sign in the sequence (2) for
any particular value x; then the number of the roots of f(x) in the interval
(b, c), the left endpoint b not being included, is given by (3).

Proof. The number of changes can be altered only by passing through
those points which are roots of the polynomials forming the sequence.
Let $x_0$ be such a point, and let $k = k_1, \ldots, k_s$ be the indices of the
functions for which $x_0$ is a root. From 3 it follows that k ≠ m. From 4
it follows that, for 1 < k < m, the triplet of polynomials $f_{k-1}, f_k, f_{k+1}$
contributes exactly one change in an interval containing $x_0$, but no other
root of $f_{k-1}$, $f_k$, or $f_{k+1}$. Thus C′(x) is altered in the roots of $f_1(x)$ only,
and its alteration is equal to the gain or loss of changes in the portion $f_1(x)$,
$f_2(x)$ of the sequence (2). As there exist simple roots only, $f_1(x)$ changes
its sign at any root $x_0$ of f(x). If it changes from − to +, f(x) increases
and therefore f′(x) > 0, and from 2 it follows that $f_2(x) > 0$; hence one
change is lost. Similarly if f(x) changes from + to −, f′(x) and therefore
$f_2(x)$ is negative and one change is lost. Under the supposition made for
the counting of the number of changes in 5-223, the zeros of $f_1(x_0)$ have to
be counted with the sign of $f_2(x_0)$. Hence C′(b) − C′(c) gives the number of
the roots of f(x) which are situated on the right of b but not on the right of c,
that is in the interval (b, c] which includes c and excludes b. Hence the
theorem.

A sequence (2) satisfying the conditions of Sturm's theorem is called
a chain of Sturm. It is possible to construct such a chain by the
following rule:

$$\begin{array}{ll}
f_1(x) = f(x), & \\
f_2(x) = f'(x), & \\
-f_3(x) = f_1(x) - q_1(x)\, f_2(x), & 0 \leq \text{degree } f_3(x) < \text{degree } f_2(x), \\
\ldots & \\
-f_{k+1}(x) = f_{k-1}(x) - q_{k-1}(x)\, f_k(x), & 0 \leq \text{degree } f_{k+1}(x) < \text{degree } f_k(x).
\end{array} \qquad (5)$$

If n is the degree of f(x), the procedure ends after not more than n steps;
the last polynomial, say $f_m(x)$, is a common factor of the sequence; as
$f_1(x) = f(x)$ and $f_2(x) = f'(x)$ have no common factor of positive degree,
$f_m(x)$ is a positive or negative constant. Thus the conditions 1, 2 and 3 of
Sturm's theorem are satisfied. Since $(f_{k-1}(x), f_k(x)) = f_m(x)$ is a constant,
$f_{k-1}(x)$ and $f_k(x)$ have no common root. Let for 1 < k < m, $f_k(x_0) = 0$; then
$-f_{k+1}(x_0) = f_{k-1}(x_0) \neq 0$. Hence
$f_{k+1}(x_0)\, f_{k-1}(x_0) = -f_{k-1}(x_0)^2 < 0$. Thus
the sequence which is determined by (5) is a chain of Sturm. Other
chains can be obtained, e.g. by multiplying the polynomials with arbitrary
positive constants.
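The rule (5) is a Euclidean algorithm with changed signs of the remainders, and can be sketched in a few lines of Python. The sketch is illustrative only: it assumes, as in the text, that f(x) has simple roots (otherwise it raises an error), it uses exact rational arithmetic, and all function names are hypothetical.

```python
from fractions import Fraction


def poly_rem(num, den):
    """Remainder of num divided by den (coefficients highest power first)."""
    num = list(num)
    while len(num) >= len(den):
        factor = num[0] / den[0]
        for i in range(len(den)):
            num[i] -= factor * den[i]
        num.pop(0)                      # the leading term has been cancelled
    while num and num[0] == 0:
        num.pop(0)
    return num


def derivative(p):
    n = len(p) - 1
    return [c * (n - i) for i, c in enumerate(p[:-1])]


def sturm_chain(coeffs):
    f1 = [Fraction(c) for c in coeffs]
    chain = [f1, derivative(f1)]
    while len(chain[-1]) > 1:
        remainder = poly_rem(chain[-2], chain[-1])
        if not remainder:
            raise ValueError("f(x) has a multiple root; divide by h(x) first")
        chain.append([-c for c in remainder])   # -f_{k+1} is the remainder
    return chain


def evaluate(p, x):
    value = Fraction(0)
    for c in p:
        value = value * x + c
    return value


def changes_at(chain, x):
    """C'(x); vanishing elements are struck out."""
    signs = [v > 0 for v in (evaluate(p, x) for p in chain) if v != 0]
    return sum(1 for s, t in zip(signs, signs[1:]) if s != t)


def count_roots(coeffs, b, c):
    """Number of roots in the interval (b, c], by C'(b) - C'(c)."""
    chain = sturm_chain(coeffs)
    return changes_at(chain, b) - changes_at(chain, c)


f = [1, -15, 68, -119, 67]
print(count_roots(f, 1, 2), count_roots(f, 2, 8), count_roots(f, 8, 9))
```

Applied to the example of 5-11, the count for (2, 8] is zero, in agreement with the result obtained there by a separate estimate of f(x).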

5-231* Legendre's polynomials. Sturm's theorem will now be applied
to Legendre's polynomials. Put

$$P_m(x) = \frac{1}{2^m\, m!}\, D^m[(x^2 - 1)^m], \quad m = 0, 1, 2, \ldots; \qquad (1)$$

* May be omitted at a first reading.

$D^m$ denotes the m-th derivative of the function written in [ ], and $D^0$ is this
function itself. If u and v are polynomials in x,

$$D^m(uv) = \sum_{k=0}^{m} \binom{m}{k} D^{m-k}(u)\, D^k(v). \qquad (2)$$

$$D^m[(x^2 - 1)^m] = D^{m-1}\{D[(x^2 - 1)^m]\} = D^{m-1}[(x^2 - 1)^{m-1} \cdot 2mx],$$

whence from (2) it follows for m > 1

$$D^m[(x^2 - 1)^m] = 2mx\, D^{m-1}[(x^2 - 1)^{m-1}] + 2m(m - 1)\, D^{m-2}[(x^2 - 1)^{m-1}]. \qquad (3)$$

On the other hand, one gets from (2) for m > 1

$$2D^m[(x^2 - 1)^m] = 2D^m[(x^2 - 1)^{m-1}(x^2 - 1)] = 2(x^2 - 1)\, D^m[(x^2 - 1)^{m-1}]$$
$$+\ 4mx\, D^{m-1}[(x^2 - 1)^{m-1}] + 2m(m - 1)\, D^{m-2}[(x^2 - 1)^{m-1}]. \qquad (4)$$

By subtracting (3) from (4) and applying (1), one gets

$$m P_m(x) = (x^2 - 1)\, P'_{m-1}(x) + mx\, P_{m-1}(x). \qquad (5)$$

As

$$P_1(x) = x, \quad P_0(x) = 1, \quad P'_0(x) = 0,$$

(5) holds also for m = 1, and therefore generally.
From

$$D^{m+1}[(x^2 - 1)^m] = D^m[2mx\,(x^2 - 1)^{m-1}] = 2mx\, D^m[(x^2 - 1)^{m-1}] + 2m^2 D^{m-1}[(x^2 - 1)^{m-1}]$$

follows

$$P'_m(x) = x P'_{m-1}(x) + m P_{m-1}(x). \qquad (6)$$

From (5) and (6) eliminate $P'_{m-1}$, and obtain

$$(x^2 - 1)\, P'_m(x) = mx\, P_m(x) - m P_{m-1}(x). \qquad (7)$$

After replacing m by m + 1 in (5), one gets

$$(m + 1) P_{m+1}(x) = (x^2 - 1)\, P'_m(x) + (m + 1)\, x P_m(x). \qquad (5')$$

From (7) and (5′) eliminate $P'_m(x)$; then

$$(m + 1) P_{m+1}(x) = (2m + 1)\, x P_m(x) - m P_{m-1}(x). \qquad (8)$$

Consider the sequence

$$P_n(x),\ P_{n-1}(x),\ \ldots,\ P_1(x),\ P_0(x) = 1 \qquad (9)$$

in the interval

$$-1 \leq x \leq 1.$$

From (7) it follows for m = n, that if $P_n(x) = 0$, $P'_n(x)$ has the same
sign as $P_{n-1}(x)$. If $P_{m+1}(x)$ and $P_m(x)$ have a common root, this root must
also be a root of $P_{m-1}(x)$ as is seen from (8), and therefore of all subsequent
polynomials of (9), in contradiction to $P_0(x) = 1$. Hence for $P_m(x) = 0$,
$P_{m+1}(x) \neq 0$, and therefore it follows from (8) that
$P_{m+1}(x)\, P_{m-1}(x) < 0$ at every root of $P_m(x)$.

As $P_n(x)$ and $P_{n-1}(x)$ have no common root, it follows from (7) that
$P_n(x)$ has no common root with its derivative and has therefore simple
roots only. Hence (9) is a chain of Sturm in the interval $-1 \leq x \leq 1$
for every n.

From (7) it follows that

$$P_m(1) = P_{m-1}(1)$$

and

$$P_m(-1) = -P_{m-1}(-1).$$

As $P_0(x) = 1$, it follows that

$$P_m(1) = 1$$

and

$$P_m(-1) = (-1)^m.$$

The number of changes of sign in (9) is therefore C′(−1) = n, C′(1) = 0.
Hence $P_n(x)$ has n different roots in the interval (−1, 1). From (1) it
follows that $P_n(x)$ is of degree n. Hence the roots of Legendre's
polynomials are all situated in the interval (−1, +1) and are simple roots.
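The recurrence (8) allows the chain (9) to be written down explicitly. The following Python sketch is an illustration only (the names `legendre`, `value` and `chain_changes` are hypothetical); it builds $P_0, \ldots, P_n$ with exact rational coefficients and counts the changes of sign of (9) at the two endpoints.

```python
from fractions import Fraction


def legendre(n):
    """P_0, ..., P_n as coefficient lists (lowest power first), built from
    (m + 1) P_{m+1} = (2m + 1) x P_m - m P_{m-1}."""
    polys = [[Fraction(1)], [Fraction(0), Fraction(1)]]
    for m in range(1, n):
        x_pm = [Fraction(0)] + polys[m]                    # x * P_m
        prev = polys[m - 1] + [Fraction(0), Fraction(0)]   # P_{m-1}, padded
        polys.append([((2 * m + 1) * a - m * b) / (m + 1)
                      for a, b in zip(x_pm, prev)])
    return polys[:n + 1]


def value(p, x):
    result = Fraction(0)
    for c in reversed(p):
        result = result * x + c
    return result


def chain_changes(n, x):
    """Changes of sign in P_n(x), P_{n-1}(x), ..., P_0(x)."""
    values = [value(p, x) for p in reversed(legendre(n))]
    signs = [v > 0 for v in values if v != 0]
    return sum(1 for s, t in zip(signs, signs[1:]) if s != t)


print(legendre(2)[2])                              # coefficients of (3x^2 - 1)/2
print(chain_changes(5, -1), chain_changes(5, 1))
```

For n = 5 the chain shows five changes at −1 and none at +1, so that all five roots of $P_5(x)$ lie between −1 and +1.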

5-24 Method for calculation of roots. To find the roots of a polynomial, say

$$f(x) = x^n + a_{n-1} x^{n-1} + \dots + a_1 x + a_0, \qquad (1)$$

one can proceed in the following manner: at first one has to determine an
interval in which all the roots are situated. For this, put

$$t = 1 + |a_{n-1}| + \dots + |a_1| + |a_0|, \qquad (2)$$

and suppose $x > t$; then

$$x^n > t\,x^{n-1} > |a_{n-1}|\,x^{n-1} + \dots + |a_1|\,x + |a_0|,$$

and therefore

$$0 < x^n - |a_{n-1}|\,x^{n-1} - \dots - |a_1|\,x - |a_0| \le f(x).$$

Hence for $x > t$, $f(x) > 0$. Similarly for $(-1)^n f(-x)$:

for $x \le -t$, $f(x) < 0$ when $n$ is odd,
and $f(x) > 0$ when $n$ is even.

At any rate, if there exists any real root of $f(x)$, it lies between $-t$ and
$+t$. In general, it is not difficult to find a smaller interval $(a, b)$ in
which all the roots of $f(x)$ are situated; it suffices that the coefficients of the
Taylor expansion of $f(x)$ for $x - a$ have alternating signs, whereas the
coefficients of the Taylor expansion for $x - b$ are all positive. In this case
$C(a) = n$ and $C(b) = 0$; therefore there cannot be any alteration of the
monotone decreasing function $C$ outside the interval $(a, b)$.
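The bound (2) is easy to check numerically. The following is only a minimal sketch in Python; the function names `root_bound` and `evaluate` are chosen here for illustration, and the quartic $y^4 - 11y^3 + 29y^2 - 24y + 2$ treated elsewhere in this chapter serves as test polynomial.

```python
def root_bound(lower):
    """t = 1 + |a_{n-1}| + ... + |a_0| for x^n + a_{n-1} x^{n-1} + ... + a_0.

    `lower` lists a_{n-1}, ..., a_0 (the leading coefficient 1 is implied).
    """
    return 1 + sum(abs(a) for a in lower)


def evaluate(lower, x):
    """Horner evaluation of the monic polynomial with the given lower coefficients."""
    value = 1
    for a in lower:
        value = value * x + a
    return value


g = [-11, 29, -24, 2]          # y^4 - 11 y^3 + 29 y^2 - 24 y + 2
t = root_bound(g)              # 1 + 11 + 29 + 24 + 2
print(t, evaluate(g, t) > 0, evaluate(g, -t) > 0)
```

As the degree is even, the polynomial is positive at both $t$ and $-t$, in agreement with the statement above.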

If an interval containing all the roots has been determined, one subdivides
the interval into smaller intervals. By Sturm's theorem one is
able to decide how many roots are contained in each of these intervals;
the intervals containing roots are subdivided again, and the method is
repeated. Given a positive number $\epsilon$, to every root $\xi$ of $f(x)$ there will
be found, after a finite number of steps, an interval of length $< \epsilon$ such
that $\xi$ is situated in that interval, i.e. every root $\xi$ is determined up to
an error $< \epsilon$.
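The subdivision procedure can be sketched as follows. This is an illustration under stated assumptions, not the author's computing scheme: the chain is built by the usual Euclidean algorithm with negated remainders, the polynomial is assumed to have simple roots only, exact rational arithmetic is used, and all function names are hypothetical.

```python
from fractions import Fraction


def evaluate(p, x):
    """Horner evaluation; p lists coefficients from the highest power down."""
    value = Fraction(0)
    for c in p:
        value = value * x + c
    return value


def derivative(p):
    n = len(p) - 1
    return [c * (n - i) for i, c in enumerate(p[:-1])]


def remainder(a, b):
    """Remainder of a divided by b, with leading zeros stripped."""
    a = list(a)
    while len(a) >= len(b):
        q = a[0] / b[0]
        for i in range(len(b)):
            a[i] -= q * b[i]
        a.pop(0)                      # the leading coefficient has been cancelled
    while a and a[0] == 0:
        a.pop(0)
    return a


def sturm_chain(p):
    """Chain p, p', -rem(p, p'), ... for a polynomial with simple roots."""
    chain = [[Fraction(c) for c in p]]
    chain.append(derivative(chain[0]))
    while True:
        r = remainder(chain[-2], chain[-1])
        if not r:
            return chain
        chain.append([-c for c in r])


def sign_changes(chain, x):
    values = [v for v in (evaluate(q, x) for q in chain) if v != 0]
    return sum(1 for s, u in zip(values, values[1:]) if (s > 0) != (u > 0))


def isolate_roots(p, lo, hi, eps):
    """Intervals (a, b] of length < eps, each containing exactly one real root."""
    chain = sturm_chain(p)
    found, todo = [], [(Fraction(lo), Fraction(hi))]
    while todo:
        a, b = todo.pop()
        count = sign_changes(chain, a) - sign_changes(chain, b)
        if count == 0:
            continue
        if count == 1 and b - a < eps:
            found.append((a, b))
            continue
        mid = (a + b) / 2
        todo += [(a, mid), (mid, b)]
    return sorted(found)


g = [1, -11, 29, -24, 2]                  # y^4 - 11 y^3 + 29 y^2 - 24 y + 2
t = 1 + sum(abs(c) for c in g[1:])        # interval (-t, t) containing all real roots
intervals = isolate_roots(g, -t, t, Fraction(1, 1000))
print([(float(a), float(b)) for a, b in intervals])
```

For this quartic the sketch finds two intervals, one near 0.0935 and one near 7.592, which are the two real roots quoted elsewhere in this chapter.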

Though this method is absolutely sound in theory, a clever computer
will hardly use it without essential modifications. The application of
Sturm's method needs plenty of calculation, and one tries therefore to avoid
it. It is mostly not difficult to get an idea about the general behaviour
of the function $f(x) = y$ and the intervals where roots might be situated.
For this purpose, graphical methods are very helpful (e.g. Lill's rectangular
method *), provided the computer is sufficiently familiar with the theory
and the practice of mathematical drawing. The method explained here
aims to "separate" the roots, i.e. to find intervals containing one root
each, and to narrow each interval till its length does not exceed the
admissible error. It is convenient to fix the error in advance; this is done
mostly by asking that a certain number of decimals must be correct. At
every step of the calculation, one may neglect some digits, but one has to take
care that the accumulated error does not influence the digits of the final
result which are required. In this book, only the methods of the calculation
can be explained; a skilful handling of them must be learnt by practice.


* See e.g. Bieberbach-Bauer, Vorlesungen über Algebra, pp. 134-140.

5-241 Linear interpolation. If $f(a)$ and $f(b)$ have different signs, the
graph of the function $f(x)$ may be replaced by the straight line connecting
the points with abscissae $a$ and $b$. This line intersects the $x$-axis in

$$x = \frac{a f(b) - b f(a)}{f(b) - f(a)}.$$

This value may be considered as a first
approximation of the root. Consider the example of 5-1.

Put

$$x - 1 = y, \qquad f(x) = g(y) = y^4 - 11y^3 + 29y^2 - 24y + 2,$$
$$g(0) = 2, \qquad g(1) = -3.$$

The approximation by the straight line leads to $0.4$, which is obviously
too great. Of course the graph of the polynomial in that interval is very
different from a straight line. Now

$$g(0) = 2, \qquad g(1) = -3,$$
$$g'(0) = -24, \qquad g'(1) = 5.$$

The graph is therefore considerably bent in the interval, and the root must
lie near the point $x - 1 = 0$. For the suitability of the approximation by
linear interpolation, it is essential that the derivate does not change its sign
in the interval.
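The interpolation step can be reproduced in a few lines. The sketch below uses exact fractions; the helper name `chord_root` is hypothetical. It computes the first approximation $0.4$ and then repeats the step once on the interval $(0, 0.4)$, which still contains the root because $g(0.4) < 0$.

```python
from fractions import Fraction


def g(y):
    return y**4 - 11 * y**3 + 29 * y**2 - 24 * y + 2


def chord_root(f, a, b):
    """Abscissa where the chord through (a, f(a)) and (b, f(b)) meets the axis."""
    fa, fb = f(a), f(b)
    return Fraction(a * fb - b * fa, fb - fa)


y1 = chord_root(g, 0, 1)       # first approximation
y2 = chord_root(g, 0, y1)      # second step on the interval (0, y1)
print(y1, float(y2))
```

Both values lie above the root, which illustrates how slowly the chord follows a strongly bent graph.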

5-242 Newton's method. The graph of $y = f(x)$ can be approximated
by its tangent for an interval near the point of contact. This means that
in the Taylor expansion of the polynomial at the point of contact, the terms
which are of the second and of higher degree are omitted. When the interval
is small, and the coefficient of the linear term is not small in comparison to the
coefficients of the higher terms, this approximation furnishes good results.
Applied to the previous example, Newton's method furnishes $y = 1/12$.

This approximation was indeed the starting point for the calculation of the
same example by Horner's method in 5-12.

If an approximation of a root is obtained which is already near to the
root, then Horner's method furnishes a Taylor expansion for this
approximation,

$$f(x) = a_n x^n + a_{n-1} x^{n-1} + \dots + a_1 x + a_0,$$

in which, for small values of $x$, the higher terms are of a small influence
(unless $a_1$ is small in comparison with the coefficients of the higher terms).
Let $x_1$ be an approximation of a root of $f(x)$ which lies near to 0, and put

$$x_2 = -\,[a_n x_1^n + \dots + a_2 x_1^2 + a_0] : a_1;$$

then $x_2$ is in general a better approximation of that root. By applying
this method to the example considered just before, and putting $x_2 = 1/12$,
one gets $x_3 = 0.091\ldots$, which is a little too small. Indeed the value of the
root has been stated in 5-12 to be equal to $0.0935324\ldots$
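Both steps can be followed numerically. In this sketch the names `newton_step` and `refine` are illustrative; `refine` solves the linear term for the new value while all other terms are evaluated at the old one, as in the formula above with $a_1 = -24$.

```python
from fractions import Fraction


def g(y):
    return y**4 - 11 * y**3 + 29 * y**2 - 24 * y + 2


def dg(y):
    return 4 * y**3 - 33 * y**2 + 58 * y - 24


def newton_step(y):
    """Intersection of the tangent at y with the axis."""
    return y - g(y) / dg(y)


def refine(y):
    """x_new = -(a_4 y^4 + a_3 y^3 + a_2 y^2 + a_0) / a_1 with a_1 = -24."""
    return (y**4 - 11 * y**3 + 29 * y**2 + 2) / 24


x2 = newton_step(Fraction(0))      # tangent at the point 0
x3 = refine(x2)
y = float(x3)
for _ in range(30):                # repeating the step, now in floating point
    y = refine(y)
print(x2, float(x3), y)
```

Repeating the second step converges here because the derivative of the right-hand side is only about 0.2 near the root.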

The methods explained here are very helpful if applied in connection
with Horner's scheme, especially when single roots are required. When
all the roots are asked for, it is often better to apply Graeffe's method, which
will be explained in 5-3.


5-25 Poulain's theorem

The number of the real roots of a polynomial with real coefficients
is interconnected with the corresponding number of a different polynomial
by a theorem named after Poulain.

Poulain's theorem. Let $h(z) = b_m z^m + \dots + b_1 z + b_0$ be a polynomial
with real coefficients and real roots only, let $b_m \neq 0$, $b_0 \neq 0$, and let $f(x)$ be
a polynomial with real coefficients; then the number of the different real
roots of

$$g(x) = b_0 f(x) + b_1 f'(x) + \dots + b_m f^{(m)}(x)$$

is not less than that of $f(x)$, and the corresponding proposition holds when
each root has been counted with its own multiplicity.

This theorem is a generalisation of the following lemma.

Lemma. Let $a \neq 0$; then the number of different real roots of $a f(x) + f'(x)$
is not less than the number of different real roots of $f(x)$, and the
corresponding proposition holds when each root has been counted with its
multiplicity.

Proof of the lemma. Let $m_1$ be the number of the different real roots
of $f(x)$, and let $m_2$ be the number of its real roots when each root is counted
with its own multiplicity; also let $m'_1$ and $m'_2$ denote the corresponding
numbers for $k(x) = a f(x) + f'(x)$. The proof will be given in several steps.
At first it will be proved that

$$m'_1 \ge m_1 - 1 \qquad (1)$$

and that

$$m'_2 \ge m_2 - 1; \qquad (2)$$

then it will be shown

$$m'_1 \neq m_1 - 1 \qquad (3)$$

and

$$m'_2 \neq m_2 - 1. \qquad (4)$$

This will prove the lemma.

(1) Let $c_1, c_2, \dots, c_{m_1}$ be the different real roots of $f(x)$ written in
ascending order, forming $m_1 - 1$ consecutive intervals. It will be proved that $k(x)$
changes its sign in each of these intervals. The sign of $f(x)$ in any one of
these intervals is constant; suppose (without loss of generality) it to be
positive in $(c_1, c_2)$. Hence $f(x)$ increases in a sub-interval $(c_1, c')$ and
decreases in $(c'', c_2)$. If $r_1$ is the multiplicity of the root $c_1$ of $f(x)$, then $f'(x)$
has a root of multiplicity $r_1 - 1$ in $c_1$. Since the number of roots of $f'(x)$
is finite, one can suppose that $f'(x)$ has no root inside the interval $(c_1, c')$;
it is therefore positive in the interval and is either positive or zero at $c_1$,
according as $r_1 = 1$ or $r_1 > 1$. In the first case, it is obvious that $k(x)$ has
the same sign as $f'(x)$ in a sub-interval $(c_1, d')$ of $(c_1, c')$, where $c_1 < d' < c'$
and $d'$ is suitably chosen; but even if $r_1 > 1$, the quotient $a f(x) : f'(x)$ tends
to 0 if $x$ tends to $c_1$, and therefore $k(x)$ has the same sign as $f'(x)$ in a suitably
chosen interval $(c_1, d')$. Thus $k(x)$ is positive in $(c_1, d')$, and in the same
manner it is shown that $k(x)$ is negative in an interval $(d'', c_2)$. Hence $k(x)$
changes its sign in $(c_1, c_2)$, and there exists therefore a root of $k(x)$ in this
interval. Correspondingly for each of the $m_1 - 1$ intervals. Hence (1)
holds.

(2) If $f(x)$ has a root of multiplicity $r_i$ in $c_i$ (for $i = 1, \dots, m_1$), then $f'(x)$
has a root of multiplicity $r_i - 1$, and $k(x) = a f(x) + f'(x)$ has also a root of
multiplicity $r_i - 1$ in $c_i$, where a root of multiplicity 0 means that there is
no root at that point. By counting the roots of $k(x)$ in $c_1, \dots, c_{m_1}$ with their
multiplicity and considering that in every interval $(c_i, c_{i+1})$ at least one
root of $k(x)$ is situated, one gets the inequality

$$m'_2 \ge \sum_{i=1}^{m_1} (r_i - 1) + (m_1 - 1) = \sum_{i=1}^{m_1} r_i - 1 = m_2 - 1.$$

Hence (2) is proved.

(3 and 4). $f(x)$ and $k(x)$ are of the same degree, say $n$. As the number of
the complex roots is even, $m_2 \equiv n \equiv m'_2 \pmod 2$. This rules out
$m'_2 = m_2 - 1$; hence (4) holds. Suppose now $m'_1 = m_1 - 1$. As $k(x)$ changes
its sign in each interval $(c_i, c_{i+1})$, it has exactly one root in each of these
intervals; this root is of odd multiplicity and $k(x)$ has no other roots.
Hence $c_1, \dots, c_{m_1}$ are non-roots of $k(x)$ and therefore simple roots of $f(x)$.
From these considerations it follows that

$$m_1 = m_2 \equiv n \pmod 2.$$

But, as the roots of $k(x)$ are of odd order, $m'_1 \equiv m'_2 \pmod 2$.
Hence $m_1 - 1 = m'_1 \equiv n \pmod 2$, and this consequence contradicts
$m_1 \equiv n \pmod 2$. Hence the supposition $m'_1 = m_1 - 1$ leads to a
contradiction, and therefore (3) holds. As (1), (2), (3), (4) are proved, the
lemma holds.


Proof of Poulain's theorem

Without any loss of generality suppose that $b_m = 1$. Hence for $m = 1$,
$g(x) = b_0 f(x) + f'(x)$, and in this case the theorem is reduced to the lemma.
Let $p > 0$, and suppose the theorem holds for $m = p$. We shall prove it
for $m = p + 1$. If $a$ is a root of $h(z)$, then $a \neq 0$, and $h(z) = (z - a)\,h_1(z)$,
where $h_1(z)$ has real roots only.

Let $h_1(z) = z^p + \dots + c_1 z + c_0$. As the theorem holds for $m = p$,
the number of the roots of

$$g_1(x) = \sum_{i=0}^{p} c_i f^{(i)}(x)$$

is greater than or equal to the number of the roots of $f(x)$. In this formula
$f^{(0)}(x)$ means $f(x)$, and

$$g'_1(x) = \sum_{i=0}^{p} c_i f^{(i+1)}(x).$$

If one replaces the powers of $z$ in $h(z) = (z - a)\,h_1(z)$ by the corresponding
derivates of $f(x)$, one gets

$$g(x) = g'_1(x) - a\,g_1(x).$$

From the preceding lemma it follows that the number of the roots of
$g(x)$ is not smaller than the number of the roots of $g_1(x)$, and therefore not
smaller than the number of the roots of $f(x)$. Hence the theorem holds.


5-3. Graeffe's Method

By Graeffe's method all the roots of a polynomial are calculated
simultaneously without any previous separation of them or any other preparatory
measure. The method leads very quickly to useful results when the roots
are different and real. It is often difficult to estimate the error made in
neglecting higher decimals; thus one should check the results afterwards.

5-31 Real distinct roots. Let $b_1, b_2, \dots, b_n$ be the roots of the
polynomial $a_0 x^n + a_1 x^{n-1} + \dots + a_n$, and let

$$|b_1| > |b_2| > \dots > |b_n|; \qquad (1)$$

then

$$-\frac{a_1}{a_0} = b_1 + b_2 + \dots + b_n = b_1\Bigl(1 + \frac{b_2}{b_1} + \dots + \frac{b_n}{b_1}\Bigr) = b_1(1 + \epsilon_1),$$

$$-\frac{a_2}{a_1} = \frac{b_1 b_2 + b_1 b_3 + \dots + b_{n-1} b_n}{b_1(1 + \epsilon_1)} = \frac{b_1 b_2\bigl(1 + \frac{b_3}{b_2} + \dots + \frac{b_{n-1} b_n}{b_1 b_2}\bigr)}{b_1(1 + \epsilon_1)} = b_2(1 + \epsilon_2),$$

$$\dots$$

$$-\frac{a_n}{a_{n-1}} = \frac{b_1 b_2 \cdots b_n}{b_1(1 + \epsilon_1) \cdots b_{n-1}(1 + \epsilon_{n-1})} = b_n(1 + \epsilon_n).$$

If $|b_i : b_{i+1}|$ is very great, for $i = 1, \dots, n - 1$, the numbers $\epsilon_i$ can be
omitted. In this case, one gets the approximation

$$b_i \approx -\frac{a_i}{a_{i-1}}, \quad \text{for } i = 1, \dots, n. \qquad (2)$$

In general (2) is not a consequence of (1), but for a suitable exponent
$m$ the quotients $b_{i+1}^m : b_i^m$ become negligible. Therefore, one has to
find a polynomial whose roots are $b_1^m, \dots, b_n^m$. The coefficients of
this polynomial are symmetric functions of $b_1, \dots, b_n$; hence it is possible
to calculate them as rational functions of $a_0, a_1, \dots, a_n$ with rational
coefficients. The calculation for an arbitrary $m$ is tiresome, but it is easy to
find a polynomial whose roots are the squares of the roots of $f(x)$, and by
repeating this construction one gets polynomials with the roots

$$b_i^2,\ b_i^4,\ b_i^8,\ \dots,\ b_i^{2^k},\ \dots$$

Let $a'_0 x^n + a'_1 x^{n-1} + \dots + a'_n$, $a'_0 = a_0^m$, have the roots $b_1^m, \dots, b_n^m$,
and let $m$ be chosen so great that $b_{i+1}^m : b_i^m$ may be neglected; then

$$b_i^m \approx -\frac{a'_i}{a'_{i-1}} \quad \text{holds.}$$

The corresponding holds for the polynomial with the roots $b_1^{2m}, \dots, b_n^{2m}$;
hence the absolute values of its coefficients become approximately
the squares of the corresponding coefficients $a'_i$. Hence one has to repeat
the construction of polynomials till the calculation shows that after further
repetition, the coefficients will be practically the squares of the coefficients
of the preceding polynomial. To get a polynomial whose roots are the
squares of the roots of $f(x)$, calculate

$$(-1)^n f(x)\,f(-x) = a_0 (x - b_1) \cdots (x - b_n) \cdot a_0 (x + b_1) \cdots (x + b_n)$$
$$= a_0^2 (x^2 - b_1^2) \cdots (x^2 - b_n^2)$$
$$= f_1(x^2).$$
The coefficients of $f_1$ will be calculated by the following scheme:

    (1)   a_0       a_1             a_2             ...    a_n
          a_0      -a_1             a_2             ...   ±a_n
    (2)   a_0^2    -a_1^2           a_2^2           ...   ±a_n^2
                   +2 a_0 a_2      -2 a_1 a_3
                                   +2 a_0 a_4

As in the first pair of lines corresponding numbers differ only by the sign,
it is customary to write only the signs in the second line. The numbers increase
very quickly; therefore it is convenient to omit the last figures,
simultaneously denoting the decimals very clearly. For this purpose replace the
decimal point by an index which is equal to the exponent of the power
of 10 with which the decimal fraction is to be multiplied. E.g.

$$3^{n}456131 = 3.456131 \cdot 10^{n}.$$

To extract the roots at the end of the calculation, we need logarithms. It
is therefore useless to calculate more decimals than the tables of logarithms
contain.

Example: $x^3 - 10x^2 + 16x - 2 = 0$.

    (1)   1     -1^{1}0         1^{1}6         -2
          +     +               +              +
          1     -1^{2}0         2^{2}56        -4
                +0^{2}32       -0^{2}4
    (2)   1     -6^{1}8         2^{2}16        -4
          +     +               +              +
          1     -4^{3}624       4^{4}6656      -16
                +0^{3}432      -0^{4}0544
    (4)   1     -4^{3}192       4^{4}6112      -16
          +     +               +              +
          1     -1^{7}75729     2^{9}12631     -256
                +0^{7}00932    -0^{9}00013
    (8)   1     -1^{7}74797     2^{9}12618     -256

At the next step the coefficients will be the squares of the preceding
coefficients, and in no case will the error have influence on the first 5 figures.
Therefore stop the procedure, and calculate now the roots by the help of
logarithms.




    log of |coeffs|     log |x|^8       log |x|        |x|
    0
                        7.24254         0.90532        8.0412
    7.24254
                        2.08506         0.26063        1.8223
    9.32760
                        1.08064 - 8     0.13508 - 1    0.1365
    2.40824
                                        -----------    -------
                                        0.30103        10.0000
                                        = log 2

The sign of the roots cannot be determined by Graeffe's method; one must
make a special investigation for the sign in every case. In this example
the coefficients have alternating signs, hence the roots are all positive.
Verification: form the elementary symmetric functions of the approximate
roots and compare

$$s_1 = 10, \qquad s_2 = 15.9998, \qquad s_3 = 2$$

for 10, 16, 2.
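The root-squaring scheme lends itself to a short program. The following sketch uses exact integer arithmetic instead of rounded figures, so its last digits need not agree with a hand calculation; the function names are chosen here for illustration.

```python
def graeffe_step(a):
    """One root-squaring step.

    a = [a_0, ..., a_n], highest power first. Returns the coefficients a'_k =
    (-1)^k (a_k^2 - 2 a_{k-1} a_{k+1} + 2 a_{k-2} a_{k+2} - ...) of the
    polynomial whose roots are the squares of the roots of the given one.
    """
    n = len(a) - 1
    out = []
    for k in range(n + 1):
        s = a[k] * a[k]
        j = 1
        while k - j >= 0 and k + j <= n:
            s += 2 * (-1) ** j * a[k - j] * a[k + j]
            j += 1
        out.append((-1) ** k * s)
    return out


def graeffe_moduli(a, steps):
    """Approximate |b_1| > ... > |b_n| after the given number of squarings."""
    for _ in range(steps):
        a = graeffe_step(a)
    m = 2 ** steps
    return [abs(a[i] / a[i - 1]) ** (1.0 / m) for i in range(1, len(a))]


f = [1, -10, 16, -2]                      # x^3 - 10 x^2 + 16 x - 2
line2 = graeffe_step(f)
line4 = graeffe_step(line2)
moduli = graeffe_moduli(f, 3)
print(line2, line4, [round(v, 4) for v in moduli])
```

Three squarings already reproduce the absolute values 8.0412, 1.8223 and 0.1365 to the accuracy of the table.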


5-32 Complex roots. If a real polynomial has complex roots, two of
them are always conjugate, and these have therefore the same absolute
value. Thus Graeffe's method has to be modified in this case. An
example will give valuable hints for necessary modifications.

Example:

$$x^4 - 11x^3 + 29x^2 - 24x + 2.$$

This polynomial has already been considered in § 1, and it is known to have
two real and two complex roots. Apply now the scheme* of Graeffe's method.

    (1)   1    -11              29              -24              2
          1    -121             841             -576             4
               +58              -528            +116
                                +4
    (2)   1    -63              317             -460             4
          1    -3969            100489          -211600          16
               +634             -57960          +2536
                                +8
    (4)   1    -3335            42537           -209064          16
          1    -1^{7}11222      1^{9}80938      -4^{10}3707      256
               +0^{7}00851     -1^{9}41525       0
                                 0
    (8)   1    -1^{7}10371      3^{8}9413       -4^{10}3707      256

If the procedure is repeated, the two first and the two last coefficients will
become the squares of the corresponding coefficients of the line (8), but
the third coefficient will depend also upon the second and the fourth. We
cannot expect that further repetition of the procedure will make the third
coefficient independent of its neighbours, as two roots of the polynomial
have an equal absolute value. If $b_1$ is greater than the absolute value of
the complex roots, then, $a_0, a_1, \dots$ denoting the coefficients of the line ($m$),

$$-\frac{a_1}{a_0} = b_1^m\Bigl(1 + \frac{b_2^m}{b_1^m} + \frac{b_3^m}{b_1^m} + \frac{b_4^m}{b_1^m}\Bigr) \approx b_1^m, \quad \text{for a suitable } m.$$

A rough mental calculation shows that b L 2 60, r- 1 7 2 


The same consideration for $f(1/x)$ shows that, if $|b_4|$ is smaller than the
absolute value of the complex roots,

$$-\frac{a_4}{a_3} \approx b_4^m$$

holds. Hence the complex roots are only dependent on the 3 middle
coefficients. In order to get the law of dependence, the considerations
will now be generalised.


* As the signs in the second line are all +, we omit these lines for abbreviation.

Let $B$ and $C$ be two intervals, so that every number of $C$ is very small in
comparison to the numbers of $B$, and let

$$f(x) = a_0 x^n + \dots + a_{n-1} x + a_n$$

have two sets of roots:

$b_1, b_2, \dots, b_r$, whose absolute values belong to $B$, and

$c_1, c_2, \dots, c_s$, whose absolute values belong to $C$, where $r + s = n$.

Let $y$ be a number of $B$, and $d$ a number of $C$; represent the coefficients
of $f(x)$ by its roots and approximate these by $y$ and $d$ respectively. As
$d$ is small in comparison to $y$,

$$a_k \approx (-1)^k \binom{r}{k} a_0\, y^k \ \text{ for } k \le r, \quad \text{and} \quad a_m \approx (-1)^m \binom{s}{m-r} a_0\, y^r d^{\,m-r} \ \text{ for } m \ge r.$$

Hence

$$y^{-s} f(y) = a_0 y^r + \dots + a_r + a_{r+1} y^{-1} + \dots + a_n y^{-s}$$
$$\approx a_0 y^r + \dots + a_r = f_1(y).$$

Let $y$ be one of the roots $b_i$; then $f_1(y) \approx 0$. Hence the roots $b_i$ can be
approximated by the roots of $f_1(x)$.

Let $x = 1 : z$; then $a_n z^n + \dots + a_1 z + a_0$ has the roots

$$\frac{1}{c_i} = b'_i \quad \text{and} \quad \frac{1}{b_i} = c'_i;$$

the absolute values of $b'_i$ belong to an interval $B'$, the absolute values of $c'_i$
belong to $C'$, and every number of $C'$ is small in comparison to those of $B'$.
Hence the roots $b'_i$ can be approximated by the roots of $a_n z^s + \dots + a_r$,
and therefore the roots $c_i$ of $f(x)$ can be approximated by the roots of

$$a_r x^s + a_{r+1} x^{s-1} + \dots + a_n.$$

Thus the polynomial $f(x)$ has to be split into two polynomials: the first
is defined by the $r + 1$ upper terms and leads to the upper class of roots;
the second one is defined by the $s + 1$ lower terms and leads to the lower
class of roots. The two classes may also be divided into sub-classes etc.
Finally we get classes

$$b_{1,1}, \dots, b_{1,r_1};\quad b_{2,1}, \dots, b_{2,r_2};\quad \dots;\quad b_{k,1}, \dots, b_{k,r_k},$$

each root being small in comparison with the roots of the preceding classes,
and to each class corresponds a polynomial, which can be cut out from $f(x)$.
The ratio of the absolute values of the roots increases when we replace these
roots by higher powers of them; therefore we get finally by Graeffe's method
$k$ polynomials, each of them having only roots with the same absolute value.
In the previous example these polynomials are

$$x - 1^{7}10371, \qquad 1^{7}10371\,x^2 - 3^{8}9413\,x + 4^{10}3707, \qquad 4^{10}3707\,x - 256.$$

From these polynomials one gets the roots of $f(x)$:

    8 log |b_1| = 7.04286           log |b_1| = 0.88036        |b_1| = 7.592
    8 log |b_4| = 7.76769 - 16      log |b_4| = 0.97096 - 2    |b_4| = 0.093532
    16 log |b_2| = log 4^{10}3707 - log 1^{7}10371
                 = 3.59767          log |b_2| = 0.22476        |b_2| = 1.6779

    log cos 8φ = log 3^{8}9413 - 8 log |b_2| - log 2 - log 1^{7}10371 = 9.45293 - 10
    8φ = ± 73°31' + k·360°
    φ = ± 9°11'22" + k·45°

To finish this calculation, one must fix the signs of the real roots and
determine the integral number $k$. As the signs of the coefficients are
alternating, there is no negative root. Hence $b_1 = 7.592\ldots$, $b_4 = 0.093532\ldots$

These numbers correspond to the results obtained in 5-1 by Horner's method
and by Lagrange's method.

As $b_1 + b_2 + b_3 + b_4 = 11$, $2r \cos φ = 3.315\ldots$ But as $2r = 3.356\ldots$,
$φ$ must be a very small angle. Hence $k = 0$.

    b_2 = 1.6567 + i 0.26803
    b_3 = 1.6567 - i 0.26803

Verification: log b_1 + log b_4 + 2 log r = 0.30104
              for log 2 = 0.30103;

              b_1 + b_2 + b_3 + b_4 = 10.999
              for 11.
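The splitting into three polynomials can be retraced by machine. The sketch below repeats the three squarings in exact integer arithmetic and then reads off the moduli and the angle; the helper names are illustrative, and the choice of $k$ is made, as above, from the condition that the roots sum to 11.

```python
import math


def graeffe_step(a):
    """Coefficients (highest power first) of the polynomial with squared roots."""
    n = len(a) - 1
    out = []
    for k in range(n + 1):
        s = a[k] * a[k]
        j = 1
        while k - j >= 0 and k + j <= n:
            s += 2 * (-1) ** j * a[k - j] * a[k + j]
            j += 1
        out.append((-1) ** k * s)
    return out


def value(p, x):
    acc = 0
    for c in p:
        acc = acc * x + c
    return acc


f = [1, -11, 29, -24, 2]
a = f
for _ in range(3):
    a = graeffe_step(a)                      # line (8): roots are 8th powers
m = 8

b1 = abs(a[1] / a[0]) ** (1 / m)             # from a_0 x + a_1
b4 = abs(a[4] / a[3]) ** (1 / m)             # from a_3 x + a_4
r = abs(a[3] / a[1]) ** (1 / (2 * m))        # product of the middle pair is r^(2m)
cos_m_phi = (-a[2] / a[1]) / (2 * r ** m)    # sum of the middle pair is 2 r^m cos(m phi)
theta = math.acos(cos_m_phi)

# phi = (+-theta + 2 pi k) / m; pick the value compatible with the sum of the roots.
trace = -f[1] / f[0]
candidates = [(s * theta + 2 * math.pi * k) / m for k in range(m) for s in (1, -1)]
phi = min(candidates, key=lambda p: abs(b1 + b4 + 2 * r * math.cos(p) - trace))
b2 = complex(r * math.cos(phi), r * math.sin(phi))
print(a, b1, b4, r, math.degrees(phi), b2)
```

With exact integers the third coefficient of line (8) comes out as 414939521, that is about $4.149 \cdot 10^8$, somewhat above the figure in the hand-computed table; the angle is then close to 9°5' and the complex pair close to $1.6572 \pm 0.2648\,i$. The moduli $|b_1|$, $|b_4|$ and $r$ are not affected by this difference.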

The result can be corrected by further calculation. As seen from the
results of 5-1 and from the checking given here, $b_1$, $b_4$ and $r$ are very exact.
The correction is therefore expected to concern mainly the angle $φ$, whose
true value may be a little smaller. As $φ$ itself is a small angle, this correction
will materially affect $\sin φ$. Hence the imaginary parts of $b_2$ and $b_3$ are
true up to the second decimal only.

If a polynomial with roots of equal absolute value has a degree > 2,
either it has multiple roots, or it has non-conjugate roots with the same
absolute value. The multiple roots can be removed by division by the
h.c.f. of the polynomial and its derivative. Non-conjugate roots of equal
absolute value can be cleared away by Horner's scheme, viz., if $|x| = |x'|$
and $x'$ is different from $x$ and $\bar{x}$, then $|x - a| \neq |x' - a|$.

Hence the real and the complex roots of $f(x)$ can be found in every
case by a combination of Graeffe's method and Horner's scheme. The
results should be verified, and it is possible to minimise the error by the
methods given in 5-1.

5-4 Roots of complex polynomials

Let $φ(x)$ be a polynomial with complex coefficients,

$$φ(x) = x^n + a_{n-1} x^{n-1} + \dots + a_1 x + a_0,$$
$$\barφ(x) = x^n + \bar a_{n-1} x^{n-1} + \dots + \bar a_1 x + \bar a_0,$$
$$(φ(x), \barφ(x)) = f_1(x), \quad φ(x) = f_1(x)\,φ_1(x), \quad \barφ(x) = f_1(x)\,\barφ_1(x), \quad f_2(x) = φ_1(x)\,\barφ_1(x);$$

then $f_1(x)$ and $f_2(x)$ are real polynomials, and every root of $φ(x)$ is either a
root of $f_1(x)$ or of $f_2(x)$. Of two conjugate roots of $f_2(x)$, one only is a root
of $φ(x)$; the other one is a root of $\barφ(x)$. Thus by applying Graeffe's method
to $f_1(x)$ and to $f_2(x)$, and verifying where necessary, it is always possible to
find the roots of $φ(x)$.

5-41 Circles enclosing the roots of $φ(x)$. As in 5-24, put

$$t = 1 + |a_{n-1}| + \dots + |a_1| + |a_0|.$$

It will be shown that the roots of $φ(x)$ are situated inside a circle of radius
$t$ about the origin. For this, it is necessary and sufficient to prove that
$|φ(x)|$ is positive for $|x| \ge t$. The proof is nearly the same as in 5-24:

$$|x|^n \ge t\,|x|^{n-1} > |a_{n-1}|\,|x|^{n-1} + \dots + |a_1|\,|x| + |a_0| \ge |a_{n-1} x^{n-1} + \dots + a_0|;$$

hence

$$0 < |x^n| - |a_{n-1} x^{n-1} + \dots + a_0| \le |φ(x)|.$$

By the following consideration, it is often possible to find a smaller
circular domain with a centre different from the origin such that all the roots
of $φ(x)$ are contained in it. Without loss of generality, one may suppose
that the coefficients of $φ(x)$ are all real (as every root of $φ(x)$ is a root of
$f_1(x)\,f_2(x)$). By a suitable transformation, $x = x' + a$, one obtains a
polynomial $f(x') = φ(x)$ with positive coefficients only. To a circular domain
$|x'| \le b$, there corresponds a circular domain in the $x$-plane which has the
centre $a$ and the radius $b$. Thus one can apply the following generalisation
of Kakeya's theorem to find that the roots of $φ(x)$ are situated between
two concentric circles.

Theorem. Let the coefficients of $f(x) = a_0 + a_1 x + \dots + a_n x^n$ be
positive and $0 < p < \dfrac{a_{k-1}}{a_k} < q$ for $k = 1, \dots, n$; then the roots of $f(x)$
satisfy the condition

$$p < |x| < q.$$

Proof. Let $x = q\,y$, $f(x) = g(y) = \sum b_k y^k$; then $b_k = q^k a_k$. Hence
$b_{k-1} : b_k < 1$. From Kakeya's theorem it follows therefore for the roots
that $|y| < 1$, and $|x| < q$. The roots of $F(z) = a_0 z^n + \dots + a_{n-1} z + a_n$
are reciprocal to the roots of $f(x)$. As $\dfrac{a_{n-k+1}}{a_{n-k}} < \dfrac{1}{p}$ holds, it follows from
the first part of the proof that the roots of $F$ must satisfy $|z| < \dfrac{1}{p}$.
Hence $|x| = \left|\dfrac{1}{z}\right| > p$ for the roots of $f(x)$, and therefore the theorem holds.
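A small numerical check may help to fix the meaning of the theorem. In this sketch the function name `ratio_bounds` and the two quadratic test polynomials are chosen for illustration; any $p$ below the smallest ratio and any $q$ above the largest one may be taken.

```python
import cmath


def ratio_bounds(a):
    """Smallest and largest ratio a_{k-1} / a_k for a = [a_0, ..., a_n], all positive."""
    ratios = [a[k - 1] / a[k] for k in range(1, len(a))]
    return min(ratios), max(ratios)


def quadratic_roots(a):
    """Roots of a_0 + a_1 x + a_2 x^2 by the usual formula."""
    a0, a1, a2 = a
    d = cmath.sqrt(a1 * a1 - 4 * a2 * a0)
    return [(-a1 + d) / (2 * a2), (-a1 - d) / (2 * a2)]


f1 = [2, 3, 1]            # 2 + 3x + x^2 = (x + 1)(x + 2)
f2 = [5, 2, 1]            # 5 + 2x + x^2, roots -1 +- 2i
for f in (f1, f2):
    lo, hi = ratio_bounds(f)
    print(lo, hi, [abs(z) for z in quadratic_roots(f)])
```

In both cases the absolute values of the roots lie between the smallest and the largest ratio of consecutive coefficients.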


5-42 Interconnection between the roots of a polynomial and those of its
derivative


Theorem of Gauss. Every convex polygon enclosing all the roots of
$φ(x)$ contains every root of $φ'(x)$.

Proof. Let $\gamma$ be an arbitrary root of $φ'$ and $\beta_1, \dots, \beta_n$ be the roots
of $φ$. Without any loss of generality, one may suppose that $\gamma$ is not a
root of $φ$. Then

$$\frac{φ'(x)}{φ(x)} = \sum_{i=1}^{n} \frac{1}{x - \beta_i};$$

hence

$$0 = \frac{φ'(\gamma)}{φ(\gamma)} = \sum_{i=1}^{n} \frac{1}{\gamma - \beta_i},$$

and therefore

$$0 = \sum \frac{1}{\bar\gamma - \bar\beta_i} = \sum \frac{\gamma - \beta_i}{|\gamma - \beta_i|^2} = \sum (\gamma - \beta_i)\,b_i,$$

where every $b_i$ is positive. In the complex plane, the numbers $\beta_i - \gamma$
represent vectors which start from $\gamma$ and lead to $\beta_1, \dots, \beta_n$. Project the
vectors $b_i(\beta_i - \gamma)$ orthogonally on any straight line of the complex plane;
then the sum of those "components" must be zero, as the sum of the
vectors themselves is zero. Consider in particular two straight lines $g_1$ and $g_2$
intersecting in $\gamma$ orthogonally. If all the points $\beta_i$ are situated on the same
side of $g_1$, then the components of $\beta_i - \gamma$ on $g_2$ have all the same sign,
and the same holds for the components of $b_i(\beta_i - \gamma)$, as the numbers $b_i$ are
positive. This is impossible, as the sum of the components of those
vectors along $g_2$ is equal to zero. Hence there exists no line $g_1$ passing through
any root of $φ'(x)$ such that the roots $\beta_i$ of $φ(x)$ lie all on the same side of $g_1$.
Let now $P$ be a convex polygon including all the roots of $φ$. If $\gamma$ is outside
of $P$, we can draw through $\gamma$ a straight line $g$ not intersecting $P$. Hence $P$,
and therefore all the roots of $φ$, are situated on the same side of $g$. Hence
$\gamma$ is not a root of $φ'$. Hence the theorem.


Let $P_0$ be the smallest convex polygon including the roots of $φ$ (the
reader may prove that such a polygon exists and is unique), $P_1$ the
corresponding polygon defined by $φ'$, ..., $P_i$ the smallest polygon
containing the roots of $φ^{(i)}$. The polygons with higher indices are included in the
preceding ones. $P_{n-1}$ degenerates to the point

$$-\frac{a_{n-1}}{n\,a_n} = \frac{1}{n} \sum \beta_i.$$

This point is the centre of gravity of the roots of $φ$, and for the same reason it is
the centre of gravity of the roots of $φ'$ and of the roots of each derivate.
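Both statements can be tried on a small example. The sketch below takes a cubic with the roots $0$, $4$ and $3i$ (an arbitrary choice made here), computes the two roots of the derivative by the quadratic formula, and checks that they lie in the triangle of the roots and have the same centre of gravity; the helper names are illustrative.

```python
import cmath


def cross(o, p, q):
    """z-component of (p - o) x (q - o), complex numbers read as plane vectors."""
    u, v = p - o, q - o
    return u.real * v.imag - u.imag * v.real


def in_triangle(z, a, b, c):
    """True if z lies in the closed triangle with the vertices a, b, c."""
    s = [cross(a, b, z), cross(b, c, z), cross(c, a, z)]
    return all(t >= 0 for t in s) or all(t <= 0 for t in s)


beta = [0j, 4 + 0j, 3j]                              # roots of the cubic
e1 = sum(beta)                                       # elementary symmetric functions
e2 = beta[0] * beta[1] + beta[0] * beta[2] + beta[1] * beta[2]

# The cubic is x^3 - e1 x^2 + e2 x - e3, so its derivate is 3x^2 - 2 e1 x + e2.
d = cmath.sqrt(4 * e1 * e1 - 12 * e2)
gamma = [(2 * e1 + d) / 6, (2 * e1 - d) / 6]

print(gamma, sum(gamma) / 2, e1 / 3)
```

The two roots of the derivate come out near $2.41 + 0.38i$ and $0.26 + 1.62i$, both inside the triangle, and their mean equals $(4 + 3i)/3$.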


5-5 Interpolation


Let

$$\beta_1, \dots, \beta_{n+1}$$

be $n + 1$ different elements of an arbitrary field $K$, and let

$$\lambda_1, \dots, \lambda_{n+1}$$

be $n + 1$ arbitrary elements of $K$. Wanted a polynomial $f(x)$ of $K[x]$ so
that $f(\beta_i) = \lambda_i$ for $i = 1, \dots, n + 1$, and degree $f(x) \le n$.

Let $f(x) = \alpha_0 + \dots + \alpha_n x^n$. This polynomial has the proposed
properties if and only if its coefficients satisfy

$$\sum_{i=0}^{n} \alpha_i \beta_k^{\,i} = \lambda_k \qquad (k = 1, \dots, n + 1).$$

The determinant of this system of $n + 1$ linear equations (see 3-53) is equal
to $\pm \prod_{i > j} (\beta_i - \beta_j)$ and is $\neq 0$, as the $n + 1$ elements $\beta_i$ are supposed to
be different. Hence the problem has one and only one solution. This
solution can be calculated by the methods explained in Ch. I, but it is
easier to get it from special cases.


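5-51 Lagrange's formula. Let $f_k(x)$ be the solution if $\lambda_i = 0$ for $i \neq k$ and
$\lambda_k = 1$; then $f(x) = \sum_{k=1}^{n+1} \lambda_k f_k(x)$ is the solution for arbitrary $\lambda$-elements.
But

$$f_k(x) = \frac{g(x)}{(x - \beta_k)\,g'(\beta_k)}, \quad \text{where } g(x) = \prod_{i=1}^{n+1} (x - \beta_i),$$

satisfies the above requirements. So we get Lagrange's formula for interpolation:

$$f(x) = \sum_{k=1}^{n+1} \lambda_k\, \frac{g(x)}{(x - \beta_k)\,g'(\beta_k)}.$$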

5-52 Interpolation by successive calculation. By Lagrange's formula
the problem of interpolation has been solved in the most complete and general
manner, but the formula is not convenient for practical calculation. It
is easier to calculate the coefficients of the product-representation of $f(x)$,

$$f(x) = \gamma_0 + \gamma_1 (x - \beta_1) + \gamma_2 (x - \beta_1)(x - \beta_2) + \dots + \gamma_n (x - \beta_1) \cdots (x - \beta_n), \qquad (1)$$

where $\gamma_0 = f(\beta_1) = \lambda_1$, and one may calculate the coefficients
$\gamma_i$ successively. When $K$ is the field of the real numbers, it is convenient
to arrange the calculation in the following manner.

Let $f_k(x)$ be defined by $f_0(x) = f(x)$, and for $k = 1, \dots, n$,

$$f_k(x) = \frac{f_{k-1}(x) - f_{k-1}(\beta_k)}{x - \beta_k};$$

then

$$f_k(x) = \gamma_k + \gamma_{k+1} (x - \beta_{k+1}) + \dots + \gamma_n (x - \beta_{k+1}) \cdots (x - \beta_n).$$

Hence $f_k(\beta_{k+1}) = \gamma_k$, and in particular $f_n(x) = \gamma_n$. Therefore calculate the values

$$\{k, m\} = f_k(\beta_m) \quad \text{for } k = 0, \dots, n;\ k < m \le n + 1$$

by

$$\{k, m\} = [\{k - 1, m\} - \{k - 1, k\}] : (\beta_m - \beta_k)$$

and $\{0, m\} = \lambda_m$. We calculate the values column-wise in the following
scheme:

                {0, m}        {1, m}                          {2, m}                        ...   {n, m}

    {k, 1}      λ_1
    {k, 2}      λ_2           (λ_2 - λ_1)/(β_2 - β_1)
    {k, 3}      λ_3           (λ_3 - λ_1)/(β_3 - β_1)         ({1,3} - {1,2})/(β_3 - β_2)
     ...
    {k, n+1}    λ_{n+1}       (λ_{n+1} - λ_1)/(β_{n+1} - β_1) ({1,n+1} - {1,2})/(β_{n+1} - β_2)  ...   ({n-1,n+1} - {n-1,n})/(β_{n+1} - β_n)

The first elements of the different columns of this scheme form the set
$\gamma_0, \gamma_1, \dots, \gamma_n$ of the coefficients of (1). This scheme is easier for
calculation than Lagrange's formula.
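The column-wise scheme can be programmed almost literally. In the following sketch the function names are illustrative and the lists are indexed from 0, so that `betas[k - 1]` stands for $\beta_k$; the sample data are again the values of $x^2 + 1$ at 0, 1, 3.

```python
from fractions import Fraction


def product_coefficients(betas, lambdas):
    """gamma_0, ..., gamma_n from {k,m} = ({k-1,m} - {k-1,k}) / (beta_m - beta_k)."""
    n = len(betas) - 1
    column = [Fraction(v) for v in lambdas]      # column[j] = {0, 1 + j}
    gammas = [column[0]]
    for k in range(1, n + 1):
        pivot = column[0]                        # {k-1, k}
        # column[j] holds {k-1, k + j}; the new entry is {k, k + j} for j >= 1
        column = [(column[j] - pivot) / (betas[k + j - 1] - betas[k - 1])
                  for j in range(1, len(column))]
        gammas.append(column[0])                 # {k, k+1} = gamma_k
    return gammas


def product_form_value(betas, gammas, x):
    """gamma_0 + gamma_1 (x - beta_1) + ... evaluated by nested multiplication."""
    value = Fraction(0)
    for g, b in zip(reversed(gammas), reversed(betas[:len(gammas)])):
        value = value * (x - b) + g
    return value


betas = [0, 1, 3]
lambdas = [1, 2, 10]
gammas = product_coefficients(betas, lambdas)
print(gammas, product_form_value(betas, gammas, 2))
```

Here $\gamma_0 = \gamma_1 = \gamma_2 = 1$, i.e. $f(x) = 1 + x + x(x - 1) = x^2 + 1$.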


5-53 Newton's formula. The calculation can be further simplified
if the elements $\beta_1, \dots, \beta_{n+1}$ are equidistant, i.e. if

$$\beta_{k+1} - \beta_k = \Delta_x$$

for every $k$; then

$$\Delta_x \{k, m\} = [\{k - 1, m\} - \{k - 1, k\}] : (m - k) = \frac{1}{m - k} \sum_{i=k}^{m-1} \Delta_{k-1,\,i},$$

where $\Delta_{k-1,\,i} = \{k - 1, i + 1\} - \{k - 1, i\}$ is the difference of two
consecutive elements in the preceding column. So $\Delta_x \{k, m\}$ is the mean-value
of the differences of consecutive elements of the rows $m$ to $k$ in the column
$k - 1$. Now in the cases under consideration, the scheme will be
transformed in such a way that the differences of consecutive elements only will
be calculated. For this purpose, the notations of the calculus of differences
will be introduced here.

Let $\Delta_x u = x - \beta_1$; then $\Delta_x (u - k + 1) = x - \beta_k$.

Let

$$F(u) = f(x) = f(\Delta_x u + \beta_1) = \gamma_0 + \gamma_1 u \Delta_x + \gamma_2 u(u - 1) \Delta_x^2 + \dots + \gamma_n u(u - 1) \cdots (u - n + 1) \Delta_x^n,$$

and let $\Delta f(x) = f(x + \Delta_x) - f(x) = F(u + 1) - F(u)$; then

$$\Delta f(x) = \Delta_x [\gamma_1 + 2\gamma_2 u \Delta_x + \dots + n \gamma_n u(u - 1) \cdots (u - n + 2) \Delta_x^{n-1}],$$

viz.,

$$(u + 1)\,u(u - 1) \cdots (u - k + 1) - u(u - 1) \cdots (u - k) = (k + 1)\,u(u - 1) \cdots (u - k + 1).$$

Let $\Delta(\Delta f(x)) = \Delta^2 f(x)$, ..., $\Delta(\Delta^r f(x)) = \Delta^{r+1} f(x)$; then by repetition of
this procedure

$$\Delta^2 f(x) = \Delta_x^2 [2\gamma_2 + 2 \cdot 3\,\gamma_3 u \Delta_x + \dots + n(n - 1) \gamma_n u(u - 1) \cdots (u - n + 3) \Delta_x^{n-2}],$$
$$\dots$$
$$\Delta^n f(x) = \Delta_x^n\, n!\, \gamma_n.$$

For abbreviation

$$\Delta^k f(\beta_i) = \Delta_i^k.$$

Hence

$$\Delta_i^{k+1} = \Delta_{i+1}^k - \Delta_i^k;$$

furthermore

$$f(\beta_1) = \gamma_0,$$
$$\Delta_1 = \Delta_x \gamma_1,$$
$$\Delta_1^k = \Delta_x^k\, k!\, \gamma_k,$$
$$\Delta_i^n = \Delta_x^n\, n!\, \gamma_n \quad (\text{for } i = 1, 2, \dots) \text{ holds.}$$

Hence Newton's formula

$$f(x) = f(\beta_1) + \Delta_1 u + \frac{1}{2!} \Delta_1^2\, u(u - 1) + \dots + \frac{1}{n!} \Delta_1^n\, u(u - 1) \cdots (u - n + 1).$$

The elements $\Delta_i^k$ can be calculated very easily by the following scheme,
in which every element is the difference of the two neighbouring elements
of the column to its left:

    λ_1        Δ_1        Δ_1^2        ...      Δ_1^n
    λ_2        Δ_2        Δ_2^2        ...
    λ_3        Δ_3         ...
     ...       ...        Δ_{n-1}^2
    λ_n        Δ_n
    λ_{n+1}

The degrees of $f(x)$, $\Delta f(x)$, ..., $\Delta^n f(x)$ are decreasing, and the last one is
a constant; so we can use the above scheme also for extrapolation to get the
value of $f(x)$ for every arbitrary integral value of $u$, that means for every
value $x = \beta_1 + k \Delta_x$, where $k$ is an arbitrary integral number.
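The difference scheme and Newton's formula can be sketched as follows. The function names are illustrative, and the sample data are the values of $x^3$ at $x = 0, 1, 2, 3$, so that $\beta_1 = 0$, $\Delta_x = 1$ and $u = x$.

```python
from fractions import Fraction


def leading_differences(values):
    """First elements of the columns of the difference scheme.

    Returns lambda_1 followed by the first, second, ..., n-th difference at beta_1.
    """
    column = list(values)
    lead = [column[0]]
    while len(column) > 1:
        column = [b - a for a, b in zip(column, column[1:])]
        lead.append(column[0])
    return lead


def newton_forward(values, u):
    """Newton's formula: sum of (k-th difference) * u(u-1)...(u-k+1) / k!."""
    total = Fraction(0)
    term = Fraction(1)                  # u(u-1)...(u-k+1) / k!
    for k, d in enumerate(leading_differences(values)):
        total += d * term
        term = term * (u - k) / (k + 1)
    return total


cubes = [0, 1, 8, 27]
print(leading_differences(cubes), newton_forward(cubes, 5),
      newton_forward(cubes, Fraction(1, 2)))
```

The value for $u = 5$ lies outside the tabulated range and illustrates the extrapolation mentioned above.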
MATRICES


The problem of solving a system of linear equations leads to the notion
of matrix (see Chapter I); moreover it has been shown (see 1-11) that any
linear transformation of a vector space is completely determined by a matrix.
As the linear transformations are of the greatest importance for many
branches of mathematics, it has been necessary to develop a theory of
matrices, of which some basic portions will be explained in this chapter.
The notion of matrix has been generalised; the theory has been extended
far beyond the modest aims of this book, and is now applied nearly
everywhere in Mathematics.


In Chapter I, a matrix has been defined as a rectangular scheme of
$mn$ numbers ordered in $n$ rows and $m$ columns. It has been shown later
on (see 2-61) that the results of Chapter I, excluding 1-7, are not affected
if in place of "numbers", elements of an arbitrary field $K$ are arranged in a
rectangular scheme to form a matrix. This generalisation will be used here.
If the elements of a matrix $A$ are elements of a field $K$, then $A$ is said to be
a matrix over the field $K$. If $A$ is a matrix over $K$, it is obviously also a
matrix over every extension of $K$. It is sometimes necessary to extend
the field over which the matrices are supposed to be situated. On the
other hand it is often necessary to restrict the considerations to matrices
the elements of which belong to a certain integral domain $\Lambda$ in $K$; these
matrices are called matrices over $\Lambda$. Thus a matrix over $\Lambda$ is simultaneously
a matrix over $K$, whereas not every matrix over $K$ is a matrix over $\Lambda$.
With a few exceptions, the matrices under consideration in this chapter will
be square shaped matrices, i.e. the number of rows is the same as the number
of columns; this number is said to be the degree of the matrix. Thus

$$A = ((a^i_k)) = \begin{pmatrix} a^1_1 & \dots & a^1_n \\ \vdots & & \vdots \\ a^n_1 & \dots & a^n_n \end{pmatrix} \qquad (1)$$

is a matrix* of degree $n$. The notation (1) is very convenient for general
investigations. Throughout this chapter, the elements of a matrix of
any degree, denoted by capital letters $A, B, C, \dots$, will be expressed by the
corresponding small type with upper and lower indices as in (1), unless
the elements are expressly given in a different form.

* Instead of brackets, sometimes pairs of vertical double bars are used to denote
a matrix.

6-1 Addition and multiplication of matrices of degree n.

Let K be a field, 0 its zero, and 1 its unitelement; let $\Delta$ be an integral
domain in K (which may be identical with K), A, B, C, ... be matrices over
$\Delta$ of degree n, and let in particular O be the matrix whose elements are all
equal to zero. Using the notation explained just before, one defines addition
and subtraction by the following formulas:

$$A + B = S, \quad\text{when}\quad a^i_k + b^i_k = s^i_k, \tag{1}$$

$$A - B = D, \quad\text{when}\quad a^i_k - b^i_k = d^i_k, \tag{2}$$

for i, k = 1, ..., n.

From these definitions follow directly

$$A + B = B + A \quad\text{(commutative law)}, \tag{3}$$

$$(A + B) + C = A + (B + C) \quad\text{(associative law)}, \tag{4}$$

$$A + O = O + A = A,$$

and denoting O - B = -B,

$$A + (-B) = A - B.$$

Thus addition and subtraction are inverse operations. The matrix -B
has the elements $-b^i_k$, and the set of all the matrices over $\Delta$ of degree n
forms a module of which O is the zero-element. The elements of this module
can be multiplied with the elements of $\Delta$ by the help of the following
definition:

$$cA = ((c\,a^i_k)). \tag{5}$$

Thus to multiply the matrix A with an element c of $\Delta$, one has to multiply
every element of A with c. From (5), the following formulas are immediate
consequences:

$$(c + d)A = cA + dA, \qquad c(A + B) = cA + cB \quad\text{(distributive laws)}, \tag{6}$$

$$0A = O, \quad 1A = A, \quad 2A = A + A, \quad mA = A + \cdots + A. \tag{7}$$

In (7), m denotes an element of the primefield of K which is obtained by
repeated addition of 1 to itself (see 2-25). All these elements belong to $\Delta$
as $\Delta$ is an integral domain; on the other hand, all these elements m belong
to the primefield of K (no restriction has been made for the characteristic
of K). Denote by

$$E_{rs} \tag{8}$$

the matrix of degree n, the elements of which are each equal to 0, except the
common element of the r-th row and the s-th column which is supposed
to be equal to 1.

Then

$$A = \sum_{i,k} a^i_k E_{ik}. \tag{9}$$

The representation (9) of a matrix A is unique. The left side of (9) is O
if and only if the $n^2$ elements $a^i_k$ are equal to zero each. Hence the $n^2$
matrices (8) are independent over $\Delta$, and they therefore form a basis. Let
$\Delta$ be a field ($\Delta$ = K), then the matrices form a vectorspace over K (see
2-61). Hence


Theorem 1. The matrices over K of degree n form a vectorspace of
rank $n^2$ over K. The matrices (8) form a basis of that vectorspace.


The multiplication of matrices of degree n has been defined already
in 1-(11)1 by

$$AB = G, \quad\text{if}\quad g^i_k = \sum_j a^i_j b^j_k. \tag{10}$$

That the associative law

$$(AB)C = A(BC) \tag{11}$$

holds for this multiplication, has been proved already in 1-(11)1, (3).
That the commutative law is not satisfied by the multiplication (10), is seen
e.g. from the example of the two matrices

(i ?) - (o ?) 

which are obviously non-commutative. From (1) and (10) follow easily
the distributive laws

$$A(B + C) = AB + AC, \qquad (A + B)C = AC + BC. \tag{12}$$

Using the notation introduced in 2-16, one gets

Theorem 2. The matrices over $\Delta$ of degree n form a ring R($\Delta$, n).
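
The ring laws just stated can be checked numerically. The following Python sketch is an illustration only and not part of the original exposition: the helper names `add` and `mul` and the three matrices of degree 2 over the integers are assumptions chosen for the example.

```python
# Matrices of degree n are stored as nested lists over the integers,
# which form an integral domain.

def add(A, B):
    """Entrywise sum, following formula (1)."""
    n = len(A)
    return [[A[i][k] + B[i][k] for k in range(n)] for i in range(n)]

def mul(A, B):
    """Row-by-column product, following formula (10)."""
    n = len(A)
    return [[sum(A[i][j] * B[j][k] for j in range(n)) for k in range(n)]
            for i in range(n)]

# Hypothetical matrices, chosen only for illustration.
A = [[1, 1], [0, 1]]
B = [[1, 0], [1, 1]]
C = [[2, 3], [5, 7]]

print(mul(A, B))   # [[2, 1], [1, 1]]
print(mul(B, A))   # [[1, 1], [1, 2]]
print(mul(mul(A, B), C) == mul(A, mul(B, C)))            # True, associative law
print(mul(A, add(B, C)) == add(mul(A, B), mul(A, C)))    # True, distributive law
```

The two products AB and BA differ, so the multiplication is not commutative, while the associative and distributive laws hold for these matrices as they must.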

The zero-element of R($\Delta$, n) is O, its unitelement E is the
diagonal-matrix (see p. 46) with diagonal-elements equal to 1. R($\Delta$, n) contains
a subring $\Delta'$ which consists of those diagonal-matrices in which all the
diagonal-elements are equal. By mapping each of these matrices on its
diagonal-element, one gets an isomorphism between $\Delta'$ and the integral
domain $\Delta$. The elements of $\Delta'$ are commutative with every matrix A
of R($\Delta$, n). The multiplication (5) of A with an element c of $\Delta$ can be
replaced by the multiplication of A with the matrix of $\Delta'$ which corresponds
to c. The matrices of $\Delta'$ are sometimes called "scalars", but this notation
will not be used here. Let A be a matrix of R($\Delta$, n), then one can form
"powers" $E = A^0$, $A = A^1$, $AA = A^2$, ..., $A^k A = A^{k+1}$, and it follows
from the associative law that $A^r A^s = A^{r+s} = A^s A^r$ holds. Hence

$$E, A, A^2, \ldots, A^k, \ldots \tag{13}$$

form a system of matrices which are commutative one to another.

Denote by A and B (with or without indices) matrices of R($\Delta$, n),
and correspondingly by c, d elements of $\Delta$, then dE = D belongs to $\Delta'$ and
therefore

$$cA\,dB = cADB = cDAB = (cd)AB.$$

More generally

$$\sum_i c_i A_i \sum_j d_j B_j = \sum_{i,j} (c_i d_j) A_i B_j,$$

$$\sum_j d_j B_j \sum_i c_i A_i = \sum_{i,j} (c_i d_j) B_j A_i.$$

If therefore each $A_i$ is commutative with each $B_j$, then $\sum c_i A_i$ and
$\sum d_j B_j$ are also commutative. Hence $c_0 E + c_1 A + \cdots + c_m A^m$ and
$d_0 E + d_1 A + \cdots + d_r A^r$ are commutative. Let now

$$f(x) = a_0 + a_1 x + \cdots + a_m x^m$$

run over all the polynomials of $\Delta[x]$, then

$$f(A) = a_0 E + a_1 A + \cdots + a_m A^m \tag{14}$$

runs over a system of matrices of degree n which form a commutative subring
R($\Delta$, A) of R($\Delta$, n). Since R($\Delta$, n) has a basis of $n^2$ elements, the elements
(13) cannot be independent. Hence there exists a polynomial f(x) which
is different from the zero-polynomial such that

$$f(A) = O.$$

A is therefore a root of a polynomial, but it is not necessarily a root of a
polynomial which is irreducible in $\Delta[x]$, as $f_1(A) f_2(A) = O$ does not imply
that $f_1(A)$ or $f_2(A)$ is equal to O. Although R($\Delta$, A) is a commutative ring
containing the unitelement E, it might not be an integral domain.

If two fields $\Delta_1$ and $\Delta_2$ have the same primefield, the rings R($\Delta_1$, n)
and R($\Delta_2$, n) have the same zero-element O and the same unitelement E.
When different zero- or unitelements are considered simultaneously (e.g.
in 6-21), a proper distinction by indices will be made; otherwise the notations
O and E will be applied.
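
A small computation illustrates why the commutative ring of polynomials in a matrix need not be an integral domain. The sketch below is only an illustration: the function `poly_at`, which evaluates formula (14), and the matrix A of degree 2 are assumptions made for the example. With $f_1(x) = f_2(x) = x - 1$ neither factor vanishes at A, yet their product does.

```python
def mul(A, B):
    """Product of two matrices of the same degree."""
    n = len(A)
    return [[sum(A[i][j] * B[j][k] for j in range(n)) for k in range(n)]
            for i in range(n)]

def poly_at(coeffs, A):
    """Evaluate f(A) = a_0 E + a_1 A + ... + a_m A^m, coeffs = [a_0, ..., a_m]."""
    n = len(A)
    result = [[0] * n for _ in range(n)]
    for a in reversed(coeffs):          # Horner's rule: result <- result*A + a*E
        result = mul(result, A)
        for i in range(n):
            result[i][i] += a
    return result

# Hypothetical matrix of degree 2 over the integers.
A = [[1, 1], [0, 1]]
ZERO = [[0, 0], [0, 0]]

F1 = poly_at([-1, 1], A)        # f1(x) = x - 1
F = poly_at([1, -2, 1], A)      # f(x) = (x - 1)^2 = 1 - 2x + x^2
G = poly_at([3, 0, 2], A)       # an arbitrary second polynomial g(x) = 3 + 2x^2

print(F1)                        # [[0, 1], [0, 0]]
print(F == ZERO)                 # True
print(mul(F1, G) == mul(G, F1))  # True
```

Here $f_1(A) \neq O$ although $f_1(A) f_1(A) = O$, and the two polynomial matrices F1 and G commute, as every pair of elements of the subring does.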

6-11 The group G(K, n). If n > 1, the ring R($\Delta$, n) cannot be mapped
isomorphically on $\Delta$, as $\Delta$ is an integral domain, and R($\Delta$, n) is
non-commutative. There exists however a non-isomorphic mapping for which
the multiplication is invariant but the addition is not. The mapping is

$$A \to \det A. \tag{1}$$

Indeed it has been shown in 1-(11)3 that

$$\det(AB) = \det A \,\det B.$$

The matrices of rank < n are mapped on 0, whereas the matrices of rank n
are mapped on the elements $\neq 0$ of $\Delta$. Consider now the case when $\Delta$ is a
field K. The matrices of R(K, n) having rank n are mapped on the elements
$\neq 0$ of the field K. These elements of K form a multiplicative abelian group
(see 2-12), as the product of any two elements $\neq 0$ of K is again such an
element, and the multiplication is commutative, associative and satisfies
the law of inverse existence. The system G(K, n) of the matrices which are
mapped on that abelian group has similar properties, only the multiplication
is not commutative. To characterise the nature of that system the following
definition will be used:

Definition. A system G of elements is said to be a group if every
ordered pair of elements a, b of G can be composed to an element ab of G
and the composition satisfies the following conditions: 1. The composition
is associative. 2. There exists an element e in G (the unitelement)
such that ae = a = ea holds for every element a of G. 3. To every
element a of G, there corresponds an (inverse) element $a^{-1}$ of G such
that $a a^{-1} = e = a^{-1} a$ holds.

There cannot exist more than one unitelement in G, for if e and e' are
unitelements, ee' must be equal to e and to e'. Moreover there exists only
one inverse element, because ba = e implies $b a a^{-1} = e a^{-1}$ and therefore
$b = a^{-1}$. Similarly one sees that ax = b has, for given a and b, one and only
one solution, namely $x = a^{-1} b$. Correspondingly $y = b a^{-1}$ is the only
solution of ya = b.

In the particular cases when the composition is commutative, the
conditions (4m), (5m), (6m) of 2-11 are satisfied; if one uses the sign of
addition in place of the sign of multiplication, those three formulas are
transformed into (4a), (5a), (6a). An additive commutative group is therefore
a module; on the other hand every module is obviously an additive
commutative group. This justifies the notation "abelian group" introduced
in 2-12. The terms "abelian group" and "commutative group" are of
course synonymous. Thus the notion of group which has been introduced
here appears to be a generalisation of the "abelian group" (module) which
has been very often used in the earlier chapters. A more comprehensive
study of the fundamental notion of group is expected to be given in the
second volume of this book.

Theorem. The system G(K, n) of the matrices over the field K of degree n
and with determinant $\neq 0$ is a group, the matrix multiplication being the
composition.

Proof. That the matrix multiplication is associative, and that there
exists a unitelement, namely E, has already been proved. It suffices therefore
to show the existence of an inverse element $A^{-1}$ satisfying

$$A A^{-1} = E = A^{-1} A, \tag{2}$$

but it has been shown already in 1-(11)4 that when $A^i_k$ denotes the
cofactor of $a^i_k$ in A, and $b^i_k = A^k_i / \det A$, then $B = A^{-1}$ satisfies (2).
Hence the theorem.

From $(AB) B^{-1} A^{-1} = E$, it follows that

$$B^{-1} A^{-1} = (AB)^{-1}. \tag{3}$$

Thus to form the inverse of a product, one has to put the inverse values
of its factors, but in the opposite order.
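
The construction of the inverse from cofactors, and the rule for the inverse of a product, can be tried on small examples. The Python sketch below is an illustration under stated assumptions: the helper names `minor`, `det`, `inverse` and `mul` and the two matrices over the rational numbers are chosen for the example and do not come from the text.

```python
from fractions import Fraction

def mul(A, B):
    n = len(A)
    return [[sum(A[i][j] * B[j][k] for j in range(n)) for k in range(n)]
            for i in range(n)]

def minor(A, i, k):
    """Matrix left after deleting row i and column k."""
    return [row[:k] + row[k + 1:] for r, row in enumerate(A) if r != i]

def det(A):
    """Determinant by expansion along the first row."""
    if not A:
        return 1                      # determinant of the empty matrix
    return sum((-1) ** k * A[0][k] * det(minor(A, 0, k)) for k in range(len(A)))

def inverse(A):
    """Element (i, k) of the inverse: cofactor of the element (k, i), divided by det A."""
    n, d = len(A), det(A)
    if d == 0:
        raise ValueError("a matrix of rank < n has no inverse")
    return [[Fraction((-1) ** (i + k) * det(minor(A, k, i)), d) for k in range(n)]
            for i in range(n)]

# Hypothetical matrices with determinant different from 0.
A = [[2, 1], [1, 1]]
B = [[1, 2], [3, 4]]
E = [[1, 0], [0, 1]]

print(mul(A, inverse(A)) == E and mul(inverse(A), A) == E)    # True
print(inverse(mul(A, B)) == mul(inverse(B), inverse(A)))      # True
print(det(mul(A, B)) == det(A) * det(B))                      # True
```

The second line of output confirms, for this pair of matrices, that the inverse of a product is the product of the inverses in the opposite order.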

*6-12 The ring R($\Delta$). The operations of addition and multiplication
as defined in 6-1, (1) and (10) cannot be applied without restrictions
to non-square shaped matrices. To apply 6-1, (1), one must suppose that
the two matrices A and B have the same number of rows, and that they have
the same number of columns, which may be different from the number
of rows. Moreover 6-1, (10) can be applied if and only if the number of the
columns of A is equal to the number of the rows of B; both the conditions
cannot hold simultaneously, unless the two matrices are square shaped and
of the same degree. This difficulty can be overcome to a certain extent
by a method similar to that applied for the addition and multiplication of
polynomials (see 2-32). It may be remembered that one is allowed to add
higher terms with zero-coefficients to a polynomial f(x), or one may omit
such terms, and that this operation does not alter f(x). Similarly one may
supplement a matrix by putting some rows composed of zeros below it, and
similarly extending it to the right side by columns of zeros; all these
matrices may be considered to be equivalent. In doing so, one replaces the
investigation of the matrices by that of classes of matrices, which differ
only by zero-rows (below) and zero-columns (on the right side). An equality
of matrices defined in this manner obviously satisfies the conditions for
equivalence (see 2-12).

* May be omitted at a first reading.


Let now A and B be two matrices over $\Delta$; by adding zero-rows and
zero-columns one can extend them to square shaped matrices A' and B'
of a sufficiently high degree, say n', and to square shaped matrices A'' and
B'' of any degree n'' > n'. Let

$$A' + B' = S', \quad A'B' = G', \qquad A'' + B'' = S'', \quad A''B'' = G'', \tag{1}$$

then one gets S'' by adding n'' - n' rows below and the same number of
columns on the right side to S', and in the same manner one gets G'' from G'.
Thus (1) defines the addition and the multiplication of classes of rectangular
matrices of any number of rows and columns, the result being independent of
the choice of the square shaped representatives A' and B' of the two classes.
As addition and multiplication defined in this manner satisfy (3), (4), (11)
and (12) of 6-1, these classes form a ring R($\Delta$). This ring is non-commutative
and does not contain a unitelement, though to every element of the
ring there exist unities reproducing it (but not every element of R($\Delta$))
by multiplication. Let e.g. $\alpha$, $\beta$ and $\gamma$ be the elements of R($\Delta$) which are
represented by the matrices


c KiiM:::) 


respectively. Then

$$\alpha\gamma = \gamma\alpha = \alpha,$$

whereas

$$\beta\gamma = \begin{pmatrix} 1 & 1 & 0 \\ 1 & 1 & 0 \\ 1 & 1 & 0 \end{pmatrix} \quad\text{and}\quad \gamma\beta = \begin{pmatrix} 1 & 1 & 1 \\ 1 & 1 & 1 \\ 0 & 0 & 0 \end{pmatrix}.$$

It seems to be reasonable to overcome this difficulty by extending every
finite matrix to an (enumerably) infinite matrix by adding zero-rows and
zero-columns, and to consider matrices with an (enumerable) infinity of rows
and columns only. The infinite diagonal-matrix where all the
diagonal-elements are equal to 1 is a unitelement in this system. This procedure
shows much analogy to the step leading from polynomials to power series,
and as in the theory of power series, considerations about convergence
become necessary here. Investigations on infinite matrices and the
corresponding vectorspaces however lie beyond the scope of this book.
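
The operations on classes of rectangular matrices can be sketched in a few lines of Python. This is an illustration only: the names `pad`, `trim`, `class_add` and `class_mul` are assumptions, and each class is represented here by its member without trailing zero-rows and zero-columns. The matrices `beta` (all elements equal to 1, degree 3) and `gamma` (unit-matrix of degree 2) are chosen so that the two products agree with the two matrices displayed above.

```python
def mul(A, B):
    n = len(A)
    return [[sum(A[i][j] * B[j][k] for j in range(n)) for k in range(n)]
            for i in range(n)]

def pad(A, n):
    """Extend A to degree n by zero-rows below and zero-columns on the right."""
    rows = [list(row) + [0] * (n - len(row)) for row in A]
    return rows + [[0] * n for _ in range(n - len(rows))]

def trim(A):
    """Drop trailing zero-rows and zero-columns: one fixed representative per class."""
    rows = [list(r) for r in A]
    while rows and not any(rows[-1]):
        rows.pop()
    width = 0
    for r in rows:
        for k, v in enumerate(r):
            if v != 0:
                width = max(width, k + 1)
    return [r[:width] for r in rows]

def degree_for(A, B):
    """A degree high enough to hold both rectangular matrices."""
    return max(len(A), len(A[0]), len(B), len(B[0]))

def class_add(A, B):
    n = degree_for(A, B)
    P, Q = pad(A, n), pad(B, n)
    return trim([[P[i][k] + Q[i][k] for k in range(n)] for i in range(n)])

def class_mul(A, B):
    n = degree_for(A, B)
    return trim(mul(pad(A, n), pad(B, n)))

beta = [[1, 1, 1], [1, 1, 1], [1, 1, 1]]
gamma = [[1, 0], [0, 1]]

print(class_mul(beta, gamma))   # [[1, 1], [1, 1], [1, 1]]
print(class_mul(gamma, beta))   # [[1, 1, 1], [1, 1, 1]]
```

Padding to a higher degree than necessary changes nothing after trimming, which is the independence of the choice of representatives stated in the text.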

6-13 Notations and formulas. A matrix can be subdivided by horizontal
and vertical lines into smaller arrays, and these again can be considered
as matrices (sub-matrices) and denoted by capital letters. This
procedure will be applied here to square shaped matrices only, and as a
matter of convenience, the intersection by horizontal lines is supposed to
be symmetric to the intersection by vertical lines; thus the matrices in the
diagonal are always supposed to be square shaped.

It is not always convenient to give a particular notation to each element
or to each submatrix in the array of a matrix. Portions left empty will
be considered to be filled with zeros, whereas those portions which are
marked with asterisks are occupied by elements of any kind (zeros or
non-zeros). Using these notations, one gets the following formulas immediately
from the definition of multiplication:





In these formulas matrices with the same index are supposed to be of the
same degree (which in particular may be equal to 1). Denote the two
matrices on the left side of (2) by A and B respectively, then

$$B^{-1}AB = \begin{pmatrix} B_1^{-1} A_1 B_1 & & \\ & \ddots & \\ & & B_s^{-1} A_s B_s \end{pmatrix}. \tag{4}$$


If one interchanges the rows and columns of a matrix, say A, one gets
the transposed of A, denoted by

$$A^T. \tag{5}$$

Apply a notation introduced in 1-(11)1:

$$(x) = \begin{pmatrix} x^1 \\ \vdots \\ x^n \end{pmatrix}, \qquad (x)^T = (x^1, \ldots, x^n). \tag{6}$$

These matrices are very useful for representing n-vectors. For abbreviation,
(x) is often called a column-vector, and $(x)^T$ a row-vector.

Let A and B be matrices over a field K and of degree n, let B be also
of rank n, then $B^{-1}$ exists, and

$$A' = B^{-1}AB \tag{7}$$

is also a matrix over K of degree n. A' is said to be the transform of A by
B and to be similar to A. This will be denoted by

$$A' \sim A. \tag{8}$$

Similarity satisfies the conditions for "equivalence" (see 2-13). Indeed
$A = E^{-1}AE \sim A$ (law of reflexivity). (7) implies $A = BA'B^{-1}$, hence
$A \sim A'$ (law of symmetry). (7) and $A'' = C^{-1}A'C$ together imply
$A'' = (BC)^{-1}A(BC) \sim A$ (law of transitivity). Hence similar matrices
form classes.

Let $f(x) = a_0 + a_1 x + \cdots + a_n x^n$, and f(A) = O. Since
$B^{-1}A^kB = (B^{-1}AB)^k$ for every k, one gets

$$O = B^{-1}f(A)B = B^{-1}a_0EB + B^{-1}a_1AB + \cdots + B^{-1}a_nA^nB = f(B^{-1}AB).$$

Hence

Theorem. If A is a root of a polynomial, the matrices similar to A
are roots of the same polynomial.

In some respect, the properties of the classes of similar matrices are
more interesting than the matrices themselves, as it will be seen from the
following pages.

6-2 Transformation of vectorspaces

Let W be a vectorspace of rank n over the field K (see 2-61), and let

$$\varepsilon_1, \ldots, \varepsilon_n \tag{1}$$

be a basis of W. Then every element of W can be expressed by

$$x^1 \varepsilon_1 + \cdots + x^n \varepsilon_n, \tag{2}$$

where $x^1, \ldots, x^n$ are elements of K. Thus every element of W can be
represented by an n-vector $(x^1, \ldots, x^n)$ as considered in Chapter I, but
this representation depends on the choice of the basis (see 2-61). If (1)
is selected as the basis, then the vectors $\varepsilon_i$ are represented as unit-n-vectors,
but if a different system of n independent vectors is chosen, then the vectors
of the new basis are represented by unit-n-vectors. Let e.g.

$$\beta_1, \ldots, \beta_n \tag{3}$$

be another basis of W, where

$$\beta_k = \sum_i b^i_k \varepsilon_i, \tag{4}$$

then the elements $b^i_k$ belong to K, and they form a matrix B, with

$$\det B \neq 0. \tag{5}$$

On the other hand (5) implies the independence of the vectors (3); hence
(5) is the necessary and sufficient condition for (3) forming a basis of W.
Every vector of W can be represented as a linear function of the basis (1)
as well as of the basis (3). Compare the coefficients of these functions,
i.e. the coordinates of the representing n-vectors:

$$\sum_i x^i \varepsilon_i = \sum_k y^k \beta_k = \sum_{k,i} b^i_k y^k \varepsilon_i.$$

As the vectors $\varepsilon_i$ are independent, for i = 1, ..., n,

$$x^i = \sum_k b^i_k y^k$$

holds. Again every n-vector can be represented by a column-vector* (see
6-13, (3)). Thus the last equation can be replaced by a matrix equation

$$(x) = B(y) \tag{6}$$

and therefore

$$(y) = B^{-1}(x). \tag{6'}$$

From these considerations follows

Theorem 1. Let (1) and (3) be two bases of the same vectorspace W,
and let them be interconnected by (4). If the same vector is represented
by (x) in the system (1) and by (y) in the system (3), then (x) and (y) are
interconnected by (6) and (6').

In particular when the basis (1) is used, the vector $\varepsilon_k$ is represented
by the k-th unitvector and $\beta_k$ by the k-th column of the matrix B, whereas
when (3) is used, $\varepsilon_k$ is represented by the k-th column of $B^{-1}$, and $\beta_k$
by the k-th unitvector.

* Instead of using column-vectors, one can represent the n-vectors by
row-vectors $(x)^T$ and $(y)^T$ which are interconnected by $(x)^T = (y)^T B^T$.

Consider now a linear transformation $\mathcal{A}$ of the vectorspace W. By
$\mathcal{A}$, the vectors forming the basis (1) are transformed into certain other vectors
of W, which again can be represented by the basis, say

$$\varepsilon_k \to \alpha_k = \sum_i a^i_k \varepsilon_i, \tag{7}$$

then an arbitrary vector $\xi = \sum_k x^k \varepsilon_k$ is transformed into

$$\xi' = \sum_{i,k} a^i_k x^k \varepsilon_i = \sum_i x'^i \varepsilon_i,$$

where

$$x'^i = \sum_k a^i_k x^k. \tag{7'}$$

This formula can be written as an equation between matrices

$$(x') = A(x). \tag{8}$$

Hence, using any particular basis (1), a linear transformation of W can be
expressed by (8). On the other hand, if a matrix A over K is given, (8)
determines a linear transformation of W (see 1-11). If in particular
$\det A \neq 0$, the vectorspace is mapped on itself, and there exists an inverse
operation to $\mathcal{A}$ which is itself a linear transformation. If det A = 0, then
W is mapped on a subspace W' whose rank is equal to rank A. If one uses
a different basis, say (3), then $\xi$ is represented as an n-vector (y) and $\xi'$
by (y'). From (6) and (8) it follows

$$(x) = B(y), \qquad (x') = B(y') = A(x) = AB(y).$$

Therefore

$$(y') = B^{-1}AB(y). \tag{9}$$

Hence

Theorem 2. If, under the suppositions of theorem 1, the linear
transformation $\mathcal{A}$ of W is expressed by (8) when the vectors are represented by
the help of the basis (1), then $\mathcal{A}$ is expressed by (9) when the same vectors
are represented by the help of (3).
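
The passage from one basis to another can be followed on a numerical example. The Python sketch below is only an illustration: the matrices A and B, the coordinate vector y and the helper names `apply`, `mul` and `inverse2` are assumptions made for the example. It carries a vector through (6), (8) and (6') and compares the outcome with formula (9).

```python
from fractions import Fraction

def mul(A, B):
    n = len(A)
    return [[sum(A[i][j] * B[j][k] for j in range(n)) for k in range(n)]
            for i in range(n)]

def apply(A, x):
    """Product of a matrix with a column-vector, stored as a plain list."""
    return [sum(a * v for a, v in zip(row, x)) for row in A]

def inverse2(B):
    """Inverse of a matrix of degree 2 with determinant different from 0."""
    (a, b), (c, d) = B
    det = a * d - b * c
    return [[Fraction(d, det), Fraction(-b, det)],
            [Fraction(-c, det), Fraction(a, det)]]

# Hypothetical data: A expresses the transformation in the basis (1),
# the columns of B express the new basis (3) in terms of the basis (1).
A = [[2, 1], [0, 3]]
B = [[1, 1], [1, 2]]
y = [5, -2]                        # coordinates of a vector in the basis (3)

x = apply(B, y)                    # (x) = B(y), formula (6)
x_new = apply(A, x)                # (x') = A(x), formula (8)
y_new = apply(inverse2(B), x_new)  # (y') = B^{-1}(x'), formula (6')

A_new = mul(inverse2(B), mul(A, B))
print(y_new == apply(A_new, y))    # True, formula (9)
```

For these data $B^{-1}AB$ comes out as the matrix with rows (3, 2) and (0, 2), and the vector with coordinates (5, -2) in the new basis is transformed into the vector with coordinates (11, -4).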

Again, let (1) denote any n independent vectors of W, then a linear
transformation $\mathcal{A}$ is uniquely determined by the transformation of the vectors
(1). By comparing the formulas (7) and (7') and putting $a^i_k = c^k_i$, one
obtains therefore immediately

Theorem 3. If by a linear transformation $\mathcal{A}$, a set of n independent
vectors (1) is transformed

$$\varepsilon_k \to \sum_i c^k_i \varepsilon_i,$$

then the transformation $\mathcal{A}$ is represented by the transposed matrix $C^T$
when (1) is used as the basis.

If in (9), B runs over all the matrices over K of degree and rank n, then
(3) runs over all the bases of W, and $B^{-1}AB$ over all the matrices similar
to A (see 6-13). Thus to a linear transformation $\mathcal{A}$ there does not
correspond a single matrix A, but the full class of similar matrices. On the other
hand, a matrix, say A, does not correspond to a single linear transformation
$\mathcal{A}$, as it generates different transformations, according to the different
bases used for it. Of course the matrix A generates $\mathcal{A}$ if the basis (1) is
used; if however a different basis, say (3), is used, A generates the same
linear transformation as is generated by the matrix $BAB^{-1}$ in connection
with the basis (1). The matrices which represent this transformation
by the help of all possible bases are the matrices which are similar to A.
Thus they form the same class of matrices as those which represent $\mathcal{A}$.
Hence there is a (1, 1)-correspondence between the classes of similar
matrices and classes of linear transformations, but a correspondence between
single matrices and transformations needs the distinction of a particular
basis. This interconnection shows the importance of the notion of
similarity of matrices.

Let $\mathcal{A}$ and $\mathcal{B}$ be two linear transformations which, by help of basis (1),
are represented by A and B respectively. Suppose that when $\xi$ runs over
the vectorspace W,

$$\mathcal{A}\colon \xi \to \xi_1, \qquad \mathcal{B}\colon \xi \to \xi_2, \quad \xi_1 \to \xi_3.$$

Then one gets linear transformations $\xi \to \xi_3$, to be denoted by

$$\mathcal{B}\mathcal{A}, \tag{10}$$

and (for any pair a, b of elements of K) $\xi \to a\xi_1 + b\xi_2$, called

$$a\mathcal{A} + b\mathcal{B}. \tag{11}$$

When the basis (1) is used, (10) is represented by BA, and (11) by aA + bB.
The linear transformations of W therefore form a ring; by using any
particular basis, this ring is mapped isomorphically to R(K, n). Different bases
furnish in general different isomorphisms. The unitelement corresponds
to E, whatever basis may be used, and it is therefore denoted by E.

6-21 Permutations as linear transformations. Let

$$\pi = \begin{pmatrix} 1 & \cdots & n \\ a_1 & \cdots & a_n \end{pmatrix} \tag{1}$$

be a permutation of n objects as considered in (0-3). Then there exists
a linear transformation of a vectorspace of rank n over K which interchanges
the vectors of a particular basis $\varepsilon_1, \ldots, \varepsilon_n$ correspondingly:

$$\varepsilon_k \to \varepsilon_{a_k} \tag{2}$$

for k = 1, ..., n. Then $\sum_k x^k \varepsilon_k \to \sum_k x^k \varepsilon_{a_k} = \sum_j x'^j \varepsilon_j$, where
$x'^{a_k} = x^k$, since $a_k = j$.

Hence

$$(x') = P(x), \tag{3}$$

where

$$P = ((p^i_k)), \qquad p^i_k = 1 \text{ for } i = a_k, \qquad p^i_k = 0 \text{ for } i \neq a_k \qquad (k = 1, \ldots, n). \tag{4}$$

The matrix P has in every column exactly one element which is equal to 1,
the other elements are equal to 0. In the k-th column, the element 1 stands
in the row $a_k$. As the numbers $a_k$ take every value 1, ..., n once and
only once when k runs over 1, ..., n, there is in every row exactly one
element which is equal to 1. Let now Q be any matrix of degree n which
shows in every row and in every column exactly one element 1, the other
places all being occupied by zero-elements; if in the k-th column, the element
1 stands on the place $b_k$, then $b_1, \ldots, b_n$ form a permutation of 1, ..., n.
Hence the permutation $\varepsilon_k \to \varepsilon_{b_k}$ of the basis is effected by the linear
transformation (x') = Q(x). Thus the matrices which have in every row and in
every column one element equal to 1 and all other elements equal to 0 are
exactly those which effect a permutation of the corresponding basis. If
one computes det P as the sum of n! products of elements taken of n different
rows and n different columns, one finds that only one of these products
is different from 0:

$$\det P = \pm \prod_k p^{a_k}_k = \pm 1. \tag{5}$$

Let in particular P represent a transposition (i, k), then $p^i_k = p^k_i = 1$,
and $p^j_j = 1$ for $i \neq j \neq k$, whereas the other elements are zeros. In
this case, det P = -1. Consider now 2 permutations of the same basis,

$$\pi_1 = \begin{pmatrix} 1 & \cdots & n \\ a_1 & \cdots & a_n \end{pmatrix}, \qquad \pi_2 = \begin{pmatrix} 1 & \cdots & n \\ b_1 & \cdots & b_n \end{pmatrix},$$

and let P and Q be the corresponding matrices, as above. Perform at
first $\pi_1$ and then $\pi_2$ (see 0-3), then one gets the permutation

$$\begin{pmatrix} 1 & \cdots & n \\ b_{a_1} & \cdots & b_{a_n} \end{pmatrix}, \quad\text{which transforms}\quad \varepsilon_k \to \varepsilon_{b_{a_k}}.$$

Let (x'') = Q(x'), then $x''^{b_j} = x'^j$. Hence

$$(x'') = QP(x) \quad\text{and}\quad x''^{b_{a_k}} = x^k.$$

Thus the composition of two permutations corresponds to the composition
of the matrices representing them. The theory of permutations appears
to be a special case of the theory of matrices; every even (odd) permutation
is composed of an even (odd) number of transpositions; the determinant
of a matrix corresponding to a transposition is equal to -1. Hence a
permutation is even or odd according as its determinant is equal to +1
or -1. As a corollary of the preceding considerations, one finds that the
two products of two matrices of degree n which have in every row and every
column exactly one element equal to 1, the other elements being 0, are
matrices of the same type. It is easy to see that these n! matrices form
a group (see 6-11).
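
These statements about permutation matrices lend themselves to a direct check. The Python sketch below is an illustration only: the helper names `perm_matrix`, `mul`, `minor` and `det` are assumptions, a permutation is stored as the list of its images $a_1, \ldots, a_n$, and the second permutation `pi2` is a hypothetical transposition chosen for the example.

```python
from itertools import permutations

def mul(A, B):
    n = len(A)
    return [[sum(A[i][j] * B[j][k] for j in range(n)) for k in range(n)]
            for i in range(n)]

def minor(A, i, k):
    return [row[:k] + row[k + 1:] for r, row in enumerate(A) if r != i]

def det(A):
    """Determinant by expansion along the first row."""
    if not A:
        return 1
    return sum((-1) ** k * A[0][k] * det(minor(A, 0, k)) for k in range(len(A)))

def perm_matrix(a):
    """Matrix with the element 1 in row a_k of column k (images given as 1, ..., n)."""
    n = len(a)
    return [[1 if a[k] == i + 1 else 0 for k in range(n)] for i in range(n)]

pi1 = [4, 5, 2, 1, 3]        # the permutation (1, 4)(3, 2, 5)
pi2 = [2, 1, 3, 4, 5]        # hypothetical second permutation: the transposition (1, 2)
P, Q = perm_matrix(pi1), perm_matrix(pi2)

# First pi1, then pi2: k -> a_k -> b_(a_k).
composite = [pi2[a - 1] for a in pi1]

print(det(P))                                # -1, an odd permutation
print(mul(Q, P) == perm_matrix(composite))   # True
print(det(mul(Q, P)))                        # 1

# The 3! permutation matrices of degree 3 are closed under multiplication.
mats = [perm_matrix(list(p)) for p in permutations(range(1, 4))]
print(all(mul(X, Y) in mats for X in mats for Y in mats))   # True
```

The product QP, not PQ, represents "first $\pi_1$, then $\pi_2$", in agreement with the formula (x'') = QP(x).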

Again consider the matrix P representing the permutation $\pi$ of the
basis-vectors $\varepsilon_1, \ldots, \varepsilon_n$. The same transformation of the vectorspace is
represented by $B^{-1}PB$ if the basis 6-2, (3) is used. In general, this matrix
does not represent a permutation, as the vectors of the new basis will not
be interchanged by the transformation, but transformed into other vectors;
if however the two bases differ by the order of the vectors only, $B^{-1}PB$
represents the permutation of the vectors

$$\varepsilon_{b_1}, \ldots, \varepsilon_{b_n}. \tag{6}$$

E.g. one can select B in such a way that the different cycles which
constitute the permutation $\pi$ form sets of consecutive indices of (6). Let
$b_1, \ldots, b_r$ form a cycle, then

$$B^{-1}PB = \begin{pmatrix} C & \\ & * \end{pmatrix}, \tag{7}$$

where

$$C = \begin{pmatrix} 0 & 0 & \cdots & 0 & 1 \\ 1 & 0 & \cdots & 0 & 0 \\ 0 & 1 & \cdots & 0 & 0 \\ \vdots & & \ddots & & \vdots \\ 0 & 0 & \cdots & 1 & 0 \end{pmatrix}$$

is a matrix of degree r.

If B is arranged according to the cycles of $\pi$ and the number of the cycles
is m, then

$$B^{-1}PB = \begin{pmatrix} C_1 & & \\ & \ddots & \\ & & C_m \end{pmatrix}, \tag{8}$$

where the C's are matrices of the type (7).


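Example.

$$\pi_1 = \begin{pmatrix} 1 & 2 & 3 & 4 & 5 \\ 4 & 5 & 2 & 1 & 3 \end{pmatrix} = (1, 4)(3, 2, 5).$$

Hence

$$P = \begin{pmatrix} 0 & 0 & 0 & 1 & 0 \\ 0 & 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 0 & 1 \\ 1 & 0 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 & 0 \end{pmatrix}.$$

To have the cycles formed by consecutive digits, one has to interchange
2 and 4. Hence put

$$B = \begin{pmatrix} 1 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 & 0 \\ 0 & 0 & 1 & 0 & 0 \\ 0 & 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 1 \end{pmatrix} = B^{-1}.$$

Then

$$B^{-1}PB = \begin{pmatrix} C_1 & \\ & C_2 \end{pmatrix}, \quad\text{where}\quad C_1 = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \quad C_2 = \begin{pmatrix} 0 & 0 & 1 \\ 1 & 0 & 0 \\ 0 & 1 & 0 \end{pmatrix}.$$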

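Again, consider (8). If $G_i^{-1} C_i G_i = H_i$, then it follows from 6-13, (4) that

$$\begin{pmatrix} G_1 & & \\ & \ddots & \\ & & G_m \end{pmatrix}^{-1} \begin{pmatrix} C_1 & & \\ & \ddots & \\ & & C_m \end{pmatrix} \begin{pmatrix} G_1 & & \\ & \ddots & \\ & & G_m \end{pmatrix} = \begin{pmatrix} H_1 & & \\ & \ddots & \\ & & H_m \end{pmatrix}.$$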


To transform P into a simpler form, it suffices to investigate the
transformation of the matrices (7) which correspond to a cyclic permutation of the
vectors $\varepsilon_1, \ldots, \varepsilon_r$. It is obvious that the vector $\varepsilon_1 + \cdots + \varepsilon_r$ is
transformed into itself by C. Investigate now whether in the vectorspace V
over the field R of the real numbers which has the basis $\varepsilon_1, \ldots, \varepsilon_r$ there
are other vectors $\xi \neq 0$ which are transformed into a vector $\lambda\xi$, where $\lambda$
is a real number. Let $\xi = z_1 \varepsilon_1 + \cdots + z_r \varepsilon_r$, then it follows from (7) that

$$\lambda z_1 = z_r, \qquad \lambda z_{i+1} = z_i \quad\text{for } i = 1, \ldots, r - 1.$$

Hence $\lambda^r = 1$. As $\lambda$ is supposed to be real, there are two cases:

1. $\lambda = 1$, $\xi = z(\varepsilon_1 + \cdots + \varepsilon_r)$.

2. $r = 2m$, $\lambda = -1$, $\xi = z(\varepsilon_1 - \varepsilon_2 + \cdots - \varepsilon_{2m})$.

If $\xi \to \pm\xi$, then $z\xi \to \pm z\xi$, thus the arbitrary factor z can be omitted.
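
The two kinds of vectors just found can be verified for small degrees. The Python sketch below is an illustration only; the names `cyclic_matrix` and `apply` are assumptions, and the matrix is built so that the k-th basis-vector is transformed into the (k+1)-th and the last into the first, which is the reading of (7) used in the equations above.

```python
def cyclic_matrix(r):
    """Matrix of degree r with the element 1 in row k+1 of column k (cyclically)."""
    return [[1 if i == (k + 1) % r else 0 for k in range(r)] for i in range(r)]

def apply(A, x):
    """Product of a matrix with a column-vector, stored as a plain list."""
    return [sum(a * v for a, v in zip(row, x)) for row in A]

ones5 = [1] * 5
alt4 = [1, -1, 1, -1]            # coordinates of e_1 - e_2 + e_3 - e_4
alt3 = [1, -1, 1]

print(apply(cyclic_matrix(5), ones5) == ones5)             # True, factor +1
print(apply(cyclic_matrix(4), alt4) == [-v for v in alt4]) # True, factor -1 for even r
print(apply(cyclic_matrix(3), alt3))                       # [1, 1, -1], no factor for odd r
```

For odd r the alternating vector is neither reproduced nor reversed, so only the factor +1 occurs, as stated in case 1.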

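Two cases have to be considered now:

1. $r = 2m + 1$. Basis: $\gamma_0 = \varepsilon_1 + \cdots + \varepsilon_r$,
$\gamma_i = \varepsilon_i - \varepsilon_{i+1}$, for i = 1, ..., 2m.

Transformation: $\gamma_0 \to \gamma_0$,
$\gamma_i \to \gamma_{i+1}$, for i = 1, ..., 2m - 1,
$\gamma_{2m} \to -\gamma_1 - \gamma_2 - \cdots - \gamma_{2m}$.

$$C \sim H = \begin{pmatrix} 1 & \\ & L' \end{pmatrix}, \quad\text{where}\quad L' = \begin{pmatrix} 0 & 0 & \cdots & 0 & -1 \\ 1 & 0 & \cdots & 0 & -1 \\ 0 & 1 & \cdots & 0 & -1 \\ \vdots & & \ddots & & \vdots \\ 0 & 0 & \cdots & 1 & -1 \end{pmatrix}$$

is a matrix of degree 2m.

2. $r = 2m + 2$. Basis: $\delta_1 = \varepsilon_1 + \varepsilon_2 + \cdots + \varepsilon_{2m+2}$,
$\delta_2 = \varepsilon_1 - \varepsilon_2 + \cdots - \varepsilon_{2m+2}$,
$\gamma_i = \varepsilon_i - \varepsilon_{i+1} + \frac{(-1)^i}{m+1}\,\delta_2$, for i = 1, ..., 2m.

Transformation: $\delta_1 \to \delta_1$, $\delta_2 \to -\delta_2$,
$\gamma_i \to \gamma_{i+1}$, for i = 1, ..., 2m - 1,
$\gamma_{2m} \to -\sum_{k=1}^{m} \gamma_{2k-1}$.

$$C \sim H' = \begin{pmatrix} 1 & & \\ & -1 & \\ & & L'' \end{pmatrix}, \quad\text{where}\quad L'' = \begin{pmatrix} 0 & 0 & \cdots & 0 & -1 \\ 1 & 0 & \cdots & 0 & 0 \\ 0 & 1 & \cdots & 0 & -1 \\ \vdots & & \ddots & & \vdots \\ 0 & 0 & \cdots & 1 & 0 \end{pmatrix}$$

is a matrix of degree 2m whose last column has the element -1 in the odd
rows and 0 in the even rows.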
Let $\mathcal{A}$ be a linear transformation of a vectorspace of rank n over the field
R of the real numbers, and let n independent vectors be interchanged by $\mathcal{A}$,
the permutation being $\pi$; suppose $\pi$ is composed of r = s + t cycles, where
s is the number of the odd cycles (which are indeed even permutations!) and
t the number of the even cycles. Then one can represent $\mathcal{A}$ by the matrix
(7), and by a further transformation one gets a diagonal-system of matrices
where the C's are replaced by H's. From a further transformation which
interchanges the basis-vectors only, results the representation



where the unit-matrices $E_r$ and $E_t$ are of the degree of their indices, $\sigma \le s$ is
the number of the odd cycles of more than one element, and $\tau \le t$ the
number of the even cycles of more than two elements.

Consider e.g. a 5-dimensional Euclidean vectorspace in which 5 vectors
are interchanged by the permutation $\pi_1$ as above. Then this
transformation can be expressed by



The reader may give a geometrical interpretation to this result 

6-3 The characteristic polynomial of a matrix 

Let $\mathcal{A}$ be a linear transformation of a vectorspace of rank n over a
field K. It has been shown already in 6-21 for particular cases that some
vectors may be invariant or may take only a factor $\lambda$ which is an element
of K. This question will now be investigated systematically.

Using any particular basis, the transformation $\mathcal{A}$ is expressed by a
matrix A. If a vector $\xi = (x^1, \ldots, x^n)$ is transformed into $\lambda\xi$, then
$\lambda\xi = A(\xi)$, or, as $\lambda\xi = \lambda E\xi$,

$$[A - \lambda E]\,\xi = O. \tag{1}$$

A vector $\xi \neq 0$ with this property exists if and only if the matrix $A - \lambda E$
is of rank < n, or

$$\det(A - \lambda E) = 0.$$

Let x be an indeterminate over K, and put

$$\chi_A(x) = \det(A - xE), \tag{2}$$

then $\chi_A(x)$ is a polynomial of K[x]; it is said to be the characteristic
polynomial of A, and the equation $\chi_A(x) = 0$ is called the characteristic equation
of A. Hence

Theorem 1. The necessary and sufficient condition that a linear
transformation $\mathcal{A}$, expressed by a matrix A, transforms a vector $\xi \neq 0$ into $\lambda\xi$
is that $\lambda$ is a root of the characteristic polynomial $\chi_A(x)$.

As similar matrices correspond to the same $\mathcal{A}$, it follows already from
theorem 1 that the characteristic polynomials of similar matrices have the same
roots; it will be shown now that these polynomials are identical. Indeed

$$\det(A - xE) = (\det B)^{-1} \det(A - xE) \det B = \det[B^{-1}(A - xE)B] = \det(B^{-1}AB - xE).$$

Hence

Theorem 2. Similar matrices have the same characteristic polynomial.

Thus the characteristic polynomial does not characterise the single matrix
but a class of similar matrices, though even these classes are not uniquely

determined by their characteristic polynomials E g the matrices (M 

have all the characteristic polynomial $(1 - x)^2$, whereas obviously they
are not all similar to each other, as the unit-matrix is not similar to any
other matrix. It may be remarked that the degree of $\chi_A(x)$ is equal to
the degree n of A, and that the highest term has the coefficient $(-1)^n$.
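
The characteristic polynomial can be computed mechanically. The Python sketch below is an illustration and not the method of the text: it uses the Faddeev-LeVerrier recurrence, a standard technique swapped in here because it avoids expanding a determinant with an indeterminate. The function name `char_poly` and the matrices are assumptions made for the example; the matrix `A_sim` was obtained from A by transformation with the hypothetical matrix B.

```python
from fractions import Fraction

def mul(A, B):
    n = len(A)
    return [[sum(A[i][j] * B[j][k] for j in range(n)) for k in range(n)]
            for i in range(n)]

def char_poly(A):
    """Coefficients [c_0, ..., c_n] of det(A - xE), lowest degree first."""
    n = len(A)
    c = [Fraction(0)] * n + [Fraction(1)]        # coefficients of det(xE - A)
    M = [[Fraction(0)] * n for _ in range(n)]
    for k in range(1, n + 1):                    # Faddeev-LeVerrier recurrence
        M = mul(A, M)
        for i in range(n):
            M[i][i] += c[n - k + 1]
        AM = mul(A, M)
        c[n - k] = -sum(AM[i][i] for i in range(n)) / k
    sign = (-1) ** n                             # det(A - xE) = (-1)^n det(xE - A)
    return [sign * ck for ck in c]

# Hypothetical matrices: A_sim = B^{-1} A B, which is equivalent to B A_sim = A B.
A = [[2, 1], [0, 3]]
B = [[1, 1], [1, 2]]
A_sim = [[3, 2], [0, 2]]
T = [[2, 1, 0], [0, 3, 1], [0, 0, 5]]

print(mul(B, A_sim) == mul(A, B))          # True, so A_sim is similar to A
print(char_poly(A) == char_poly(A_sim))    # True, both give 6 - 5x + x^2
print(char_poly(T)[-1])                    # -1, the coefficient (-1)^n for n = 3
```

Both similar matrices give the polynomial $6 - 5x + x^2$ with the roots 2 and 3, and for the matrix of degree 3 the highest coefficient is $(-1)^3$.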

Suppose $\lambda_1, \ldots, \lambda_m$ are different roots of $\chi_A(x)$ and $\xi_1, \ldots, \xi_m$ are
vectors $\neq 0$ for which $A\xi_i = \lambda_i \xi_i$ holds. Suppose the m vectors are
dependent, then there exists a subset of them, say $\xi_1, \ldots, \xi_r$, which is
dependent, whereas any r - 1 of these vectors are independent. There
exists therefore one and (up to a common factor) only one equation

$$a_1 \xi_1 + \cdots + a_r \xi_r = 0 \tag{3}$$

between them, and none of its coefficients is zero. Multiply this matrix
equation from the left side with A, then

$$a_1 \lambda_1 \xi_1 + \cdots + a_r \lambda_r \xi_r = 0. \tag{4}$$

Subtracting $\lambda_r$ times (3) from (4), one obtains an equation between
$\xi_1, \ldots, \xi_{r-1}$ with the coefficients $a_i(\lambda_i - \lambda_r)$. Since the elements
$\lambda_i$ are supposed to be all different, these coefficients are $\neq 0$; it follows
that $\xi_1, \ldots, \xi_{r-1}$ are dependent, contrary to the supposition.
The $\xi$'s are therefore independent. Hence

Theorem 3. If $A(\xi_i) = \lambda_i \xi_i$ for i = 1, ..., m and the m roots $\lambda_i$ are
all different, then the corresponding vectors $\xi_i$ are independent.

6-31 Characteristic polynomials with n different roots 

Let A be a matrix of degree n over K, and $\chi_A(x)$ its characteristic
polynomial. From the fundamental theorem of general algebra (see 2-6)
it follows that there exists an algebraic extension $\Lambda$ of K such that

$$\chi_A(x) = (-1)^n (x - \lambda_1) \cdots (x - \lambda_n). \tag{1}$$

Consider now A as a matrix over $\Lambda$, and investigate the transformation
of the vectorspace V of rank n over $\Lambda$ by the class of matrices over $\Lambda$
which are similar to A and correspond to a linear transformation $\mathcal{A}$. To
every root $\lambda_i$, there exists at least one vector $\xi_i \neq O$ in V such that
by the transformation $\mathcal{A}$

$$\xi_i \to \lambda_i \xi_i \tag{2}$$

holds. Suppose now that the n roots are all different. Then there exist
corresponding to the n different roots also n different vectors $\xi_i$ satisfying
(2), and from 6-3, theorem 3 it follows that these vectors are independent;
hence they form a basis of V. Every vector $\alpha$ of V can be represented by

$$\alpha = \sum_i x^i \xi_i, \tag{3}$$

and from (2) it follows that $\alpha \to \sum_i x^i \lambda_i \xi_i$ holds. Hence the transformation
$\mathcal{A}$ is expressed by the matrix

$$L = \begin{pmatrix} \lambda_1 & & \\ & \ddots & \\ & & \lambda_n \end{pmatrix} \tag{4}$$



when is used as basis of F, and therefore 


A L 


( 5 ) 
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The formula (2) can also be expressed by L = X, = X,E|, Hence 

(L - X, E) f , = O 

L — Xi E is a linear polynomial in L, and belongs to the commutative 
ring R(A, L) (see 6-1), hence these factors are commutative Therefore 

7T(L - Xi E) Ij = 0, for J = 1, . , n (6) 

i 

From (1) it follows that the product on the left side of (6) is equal to Xa(L). 
Hence Xa(L) is a matrix which transforms every vector of the basis, and 
therefore every vector of V into 0 The rank of X A (L) is therefore zero 
(see 1-5) i e X A (L) = 0 Hence L is a root of X A (-*0 . thus it follows 
from the theorem at the end of 6-13 that 

Xa(A) = o (7) 

This formula has been established here under the supposition that the roots 
of Xa(-t) are all different, but m 6-32 it will be proved without restriction. 
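
Formula (7) can be checked on a small instance. The following Python sketch is an illustration added here, not part of the original text; the matrix `A` and the helper names `mat_mul`, `shifted` and `product_of_factors` are assumptions chosen for the example. It takes a matrix of degree 2 whose characteristic polynomial $(2 - x)^2 - 1 = (x - 1)(x - 3)$ has two different roots, and forms the product $(A - E)(A - 3E)$, which by (6) and (7) should be the zero matrix.

```python
# Illustrative check of formula (7) for a matrix with n different roots.
# The matrix A and all helper names are assumptions made for this sketch.

def mat_mul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def shifted(a, lam):
    """Return the matrix A - lam*E."""
    n = len(a)
    return [[a[i][j] - (lam if i == j else 0) for j in range(n)]
            for i in range(n)]

def product_of_factors(a, roots):
    """Return the product of the matrices (A - lam*E) over the given roots."""
    n = len(a)
    result = [[1 if i == j else 0 for j in range(n)] for i in range(n)]
    for lam in roots:
        result = mat_mul(result, shifted(a, lam))
    return result

A = [[2, 1],
     [1, 2]]          # det(A - xE) = (x - 1)(x - 3)
roots = [1, 3]

# For even n the product of the factors is chi_A(A) itself.
chi_of_A = product_of_factors(A, roots)
print(chi_of_A)
```

Since the factors commute, the order in which the roots are listed does not affect the product.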


6-32 Multiple roots of a characteristic polynomial 


The results of 6-31 must be modified when the characteristic polynomial
has multiple roots. The matrix

B = [matrix of degree 2; its elements are not legible in the source]

cannot be transformed into a diagonal-matrix, as such a diagonal-matrix is
bound to have the same characteristic polynomial as B, i.e. it must be the
unit-matrix, which however is not similar to any other matrix. Moreover it
is easily seen that there exists only one 1-dimensional subspace which is
invariant for B, whereas when a matrix of degree 2 has a characteristic
polynomial with two different roots, two such subspaces exist. The
generalisation of the results in 6-31 which will be proved now is the following.


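Theorem. Let $\mathfrak{A}$ be a transformation of a vectorspace V of rank n
over the field $\Lambda$, let A be a matrix expressing $\mathfrak{A}$, and

$$\chi_A(x) = (\lambda_1 - x)^{r_1} \cdots (\lambda_m - x)^{r_m}, \qquad (1)$$

where the different roots $\lambda_1, \dots, \lambda_m$ belong to $\Lambda$; then V is generated by m
vectorspaces $V_1, \dots, V_m$ with the following properties (for $j = 1, \dots, m$):

(1) $V_j$ is of rank $r_j$ and invariant for $\mathfrak{A}$.

(2) $V_j$ is transformed into O by

$$(\mathfrak{A} - \lambda_j E)^{r_j} \qquad (2)$$

and every vector with this property belongs to $V_j$.

(3) If $\varepsilon^j_1, \dots, \varepsilon^j_{r_j}$ is a basis of $V_j$, then

$$\varepsilon^1_1, \dots, \varepsilon^1_{r_1}, \ \dots, \ \varepsilon^m_1, \dots, \varepsilon^m_{r_m} \qquad (3)$$

is a basis of V, and for this basis, $\mathfrak{A}$ is expressed by

$$L = \begin{pmatrix} L_1 & & \\ & \ddots & \\ & & L_m \end{pmatrix} \qquad (4)$$

where $L_j$ is of degree $r_j$, and $\chi_{L_j}(x) = (\lambda_j - x)^{r_j}$. Moreover $\chi_A(A) = O$.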
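
The theorem can be illustrated in the simplest case of a double root. The Python sketch below is an addition to the text; the matrix `B` is an assumed example of a matrix of degree 2 with characteristic polynomial $(1 - x)^2$ that is not the unit-matrix (it need not coincide with the matrix printed in the original), and the helper names are hypothetical. It checks that $B - E$ is not zero while $(B - E)^2$ is zero, so that $V_1 = V$ is mapped on O by the second power, and that the vectors mapped on O by $B - E$ itself form a subspace of rank 1 only.

```python
from fractions import Fraction

def mat_mul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def rank(m):
    """Rank of a matrix over the rationals, by elimination on the rows."""
    rows = [[Fraction(x) for x in row] for row in m]
    r = 0
    for c in range(len(rows[0])):
        pivot = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(len(rows)):
            if i != r and rows[i][c] != 0:
                f = rows[i][c] / rows[r][c]
                rows[i] = [x - f * y for x, y in zip(rows[i], rows[r])]
        r += 1
    return r

B = [[1, 1],
     [0, 1]]                      # assumed example, chi_B(x) = (1 - x)^2
N = [[B[i][j] - (1 if i == j else 0) for j in range(2)] for i in range(2)]  # B - E

N_squared = mat_mul(N, N)         # (B - E)^2
eigen_rank = 2 - rank(N)          # rank of the space mapped on O by B - E

print(N, N_squared, eigen_rank)
```

A diagonal-matrix with the same characteristic polynomial would be the unit-matrix, for which already the first power $B - E$ vanishes; the sketch shows that this is not the case for `B`.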


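Proof. As $\lambda_1$ is a root of $\chi_A(x)$, there exists a vector, say $\beta_1 \neq O$,
which is transformed into $\lambda_1 \beta_1$ by $\mathfrak{A}$. Choose $\beta_1$ as the first element of
a basis of V; then it follows from 6-2, theorem 3 that $\mathfrak{A}$ is then represented
by a matrix

$$A_1 = \begin{pmatrix} \lambda_1 & * \\ 0 & A' \end{pmatrix} \qquad (5)$$

and $A \sim A_1$. The corresponding holds for every matrix of any degree
if $\lambda_1$ is a root of its characteristic polynomial. Now

$$\chi_A(x) = \chi_{A_1}(x) = (\lambda_1 - x) \det(A' - xE) = (\lambda_1 - x)\, \chi_{A'}(x).$$

Hence

$$\chi_{A'}(x) = (\lambda_1 - x)^{r_1 - 1} (\lambda_2 - x)^{r_2} \cdots (\lambda_m - x)^{r_m}.$$

If $r_1 > 1$, then $\lambda_1$ is a root of $\chi_{A'}(x)$. Hence A' is similar to a matrix
of degree $n - 1$ and of the type (5), say

$$A' \sim \begin{pmatrix} \lambda_1 & * \\ 0 & A'' \end{pmatrix};$$

from 6-13, (1) it follows that

$$A_1 \sim A_2 = \begin{pmatrix} \lambda_1 & * & * \\ 0 & \lambda_1 & * \\ 0 & 0 & A'' \end{pmatrix}$$

and therefore $A \sim A_2$. This procedure can be repeated $r_1$ times. The
result is

$$A \sim A_{r_1} = \begin{pmatrix} \lambda_1 & * & \cdots & * & * \\ & \lambda_1 & \cdots & * & * \\ & & \ddots & \vdots & \vdots \\ & & & \lambda_1 & * \\ & & & & A^{(r_1)} \end{pmatrix} \qquad (6)$$

where $A^{(r_1)}$ is a matrix of degree $n - r_1$. Formula (5) shows that (6) holds
also when $r_1 = 1$. Let

$$\beta_1, \dots, \beta_{r_1}, \gamma_1, \dots, \gamma_{n - r_1} \qquad (7)$$

be the basis of V corresponding to the representation of $\mathfrak{A}$ by $A_{r_1}$; then
$\beta_1, \dots, \beta_{r_1}$ generate a vectorspace $V_1$ of rank $r_1$, and it will be proved to
have the properties stated in the theorem.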


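(1) $A_{r_1}$ transforms

$$\begin{aligned}
\beta_1 &\to \lambda_1 \beta_1 \\
\beta_2 &\to \lambda_1 \beta_2 + a \beta_1 \\
&\ \ \vdots \\
\beta_{r_1} &\to \lambda_1 \beta_{r_1} + c_1 \beta_{r_1 - 1} + \dots + c_{r_1 - 1} \beta_1
\end{aligned} \qquad (8)$$

Hence $\sum_h b_h \beta_h \to \sum_k d_k \beta_k$. $V_1$ is therefore mapped on itself by $A_{r_1}$, and as $A_{r_1}$
is the matrix representing $\mathfrak{A}$ for the basis (7), the vectorspace $V_1$ is
invariant for $\mathfrak{A}$.

(2) From (8) follows that

$$(A_{r_1} - \lambda_1 E)\, \beta_1 = O,$$

$$(A_{r_1} - \lambda_1 E)\, \beta_2 = a \beta_1, \ \text{hence} \ (A_{r_1} - \lambda_1 E)^2 \beta_2 = O.$$

By repeating this procedure, one gets for $k = 1, \dots, r_1$

$$(A_{r_1} - \lambda_1 E)^k \beta_k = O$$

and therefore $(A_{r_1} - \lambda_1 E)^{r_1} \beta_k = O$. Hence

$$(A_{r_1} - \lambda_1 E)^{r_1} \xi = O$$

for every vector $\xi$ of $V_1$, and therefore $(\mathfrak{A} - \lambda_1 E)^{r_1}$ maps $V_1$ on O.
Suppose now that there exists in V a vector $\alpha$ which does not belong to
$V_1$ but is mapped on O by a suitable power of $\mathfrak{A} - \lambda_1 E$, say by $(\mathfrak{A} - \lambda_1 E)^t$.
Consider the sequence of vectors on which $\alpha$ is mapped by $(\mathfrak{A} - \lambda_1 E)^k$
for $k = 0, \dots, t$. The first of these vectors lies outside $V_1$, the last is O
and therefore belongs to $V_1$. Select the last of the vectors not belonging
to $V_1$ and call it $\beta_{r_1 + 1}$; thus $\beta_{r_1 + 1}$ is mapped by $\mathfrak{A} - \lambda_1 E$ on a vector
of $V_1$. Now there exists a basis $\beta_1, \dots, \beta_{r_1}, \beta_{r_1 + 1}, \delta_1, \dots, \delta_{n - r_1 - 1}$ of V. For
this basis, $\mathfrak{A}$ is represented by

$$C = \begin{pmatrix} \lambda_1 & * & \cdots & * & * \\ & \lambda_1 & \cdots & * & * \\ & & \ddots & \vdots & \vdots \\ & & & \lambda_1 & * \\ & & & & S \end{pmatrix}$$

with $r_1 + 1$ elements $\lambda_1$ in the diagonal, where S is of degree $n - r_1 - 1$, and
$C \sim A$. Hence $\chi_A(x) = \chi_C(x)$, but $\chi_C(x) = (\lambda_1 - x)^{r_1 + 1} \det(S - xE)$,
and this contradicts (1). Hence there exists no vector in V outside $V_1$
which by a suitable power of $\mathfrak{A} - \lambda_1 E$ is mapped on O. Hence $V_1$ has
the properties required by (2). In the same manner, one finds subspaces
$V_2, \dots, V_m$ of V with the properties (1) and (2).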

(3) To show that $V_1, \dots, V_m$ generate V, and that a basis $\varepsilon^1_1, \dots, \varepsilon^m_{r_m}$
as required exists, one has to show that

$$\xi_1 + \dots + \xi_m = O \qquad (9)$$

implies $\xi_1 = \dots = \xi_m = O$, when $\xi_i$ belongs to $V_i$ (for $i = 1, \dots, m$). For
this purpose, consider the polynomials $\psi_i(x) = \chi_A(x) / (\lambda_i - x)^{r_i}$. The
h.c.f. of these m polynomials is of degree 0. Hence (see 2-47, theorem 2)
there exist in $\Lambda[x]$ polynomials $k_1(x), \dots, k_m(x)$ such that

$$\psi_1(x) k_1(x) + \dots + \psi_m(x) k_m(x) = 1.$$

Hence

$$\psi_1(A) k_1(A) + \dots + \psi_m(A) k_m(A) = E \qquad (10)$$

Since for $i \neq j$, $\psi_i(A)$ is divisible by $(A - \lambda_j E)^{r_j}$, there is $\psi_i(A)\, \xi_j = O$.
By multiplying (10) with $\xi_j$ from the right side, one gets therefore
$\psi_j(A) k_j(A)\, \xi_j = \xi_j$. If in particular the vectors $\xi_1, \dots, \xi_m$ are those
satisfying (9), for every particular j, there is

$$O = \psi_j(A) k_j(A) \sum_i \xi_i = \psi_j(A) k_j(A)\, \xi_j = \xi_j.$$

Hence (9) implies that $\xi_1 = \dots = \xi_m = O$. Select now in every $V_j$ a
basis $\varepsilon^j_1, \dots, \varepsilon^j_{r_j}$. Then the n vectors $\varepsilon^1_1, \dots, \varepsilon^m_{r_m}$ are independent, and
since n is the rank of V, they form a basis of V. For the transformation
$\mathfrak{A}$, each of the spaces $V_j$ is invariant; therefore

$$\varepsilon^j_k \to a^j_{k1} \varepsilon^j_1 + \dots + a^j_{k r_j} \varepsilon^j_{r_j}.$$

Therefore, if one selects the basis $\varepsilon^1_1, \dots, \varepsilon^m_{r_m}$ to represent the
transformation $\mathfrak{A}$, the matrix must have the form (4). In particular, select
$\varepsilon^1_1 = \beta_1, \dots, \varepsilon^1_{r_1} = \beta_{r_1}$; then it follows from (8) that

$$L_1 = \begin{pmatrix} \lambda_1 & * & \cdots & * \\ & \lambda_1 & \cdots & * \\ & & \ddots & \vdots \\ & & & \lambda_1 \end{pmatrix},$$

hence $\chi_{L_1}(x) = (\lambda_1 - x)^{r_1}$, but this result is independent of the choice
of the basis of $V_1$. Similarly $\chi_{L_i}(x) = (\lambda_i - x)^{r_i}$, for $i = 1, \dots, m$. As
$\varepsilon^j_k$ is transformed into O by $(\mathfrak{A} - \lambda_j E)^{r_j}$, it is also mapped on O
by $\chi_A(\mathfrak{A})$, and this holds for every j. Hence every vector of V is mapped
on zero by $\chi_A(\mathfrak{A})$, and therefore $\chi_A(A) = O$. Hence the theorem holds.

6-33 Transformations with characteristic polynomial $(\lambda - x)^r$

In the preceding article, the results of 6-31 have been generalised in three
directions. Firstly it has been shown that to each of the different roots $\lambda_i$
of the characteristic polynomial, there corresponds an invariant subspace $V_i$
whose rank is equal to the multiplicity $r_i$ of the root; $V_i$ is characterised
by the property that it is mapped on O by $(\mathfrak{A} - \lambda_i E)^{r_i}$. Secondly,
the equation $\chi_A(A) = O$ holds unconditionally. Thirdly, every matrix
can be transformed into a diagonal system of as many matrices as the
characteristic polynomial has different roots; the degrees of these matrices
are equal to the multiplicities of the different roots, and the characteristic
polynomials are $(\lambda_i - x)^{r_i}$. Out of these three items only the second
one seems to be a full generalisation of a result of 6-31 (see however 6-35).
Consider the first and the third way of generalisation. If $r_i = 1$, then $V_i$
is mapped on O by every positive power of $\mathfrak{A} - \lambda_i E$; in general however
one knows only that $V_i$ is mapped on O by $(\mathfrak{A} - \lambda_i E)^e$ for $e \geq r_i$, but
it is not known whether such a mapping can be performed with $e < r_i$.
It will be shown here that different cases must be distinguished.
When all the roots of $\chi_A(x)$ are different, A is similar to a diagonal-matrix
of which every element is known; when the roots are not all different, the
preceding results show only that A is similar to a matrix consisting of a
diagonal system of matrices, and of these matrices only the characteristic
polynomials $(\lambda_i - x)^{r_i}$ are known.

To supplement the results of 6-32, it is therefore necessary to investigate
the subspaces $V_i$ and the matrices $L_i$ for which $\chi_{L_i}(x) = (\lambda_i - x)^{r_i}$,
since when V is transformed by A (using a particular basis), then $V_i$ is
transformed by $L_i$. If on the other hand, one finds a uniquely determined
distinguished matrix $H_i$ which is similar to $L_i$ (for $i = 1, \dots, m$), then A
is similar to the matrix formed by the m matrices $H_i$ written in the diagonal.
Thus if "canonical" forms for those matrices where the roots of the
characteristic polynomial are all equal are found out, one gets automatically
canonical forms for all the matrices.

Suppose now that L is a matrix of $R(\Lambda, r)$; let

$$\chi_L(x) = (\lambda - x)^r, \qquad (1)$$

where $\lambda$ belongs to $\Lambda$. The corresponding vectorspace of rank r over $\Lambda$
will be denoted by V. If for any vector $\xi$ of V

$$(L - \lambda E)^e \xi = O \qquad (2)$$

holds, then the same equation holds for every exponent $k > e$. Let e be
the smallest positive integer for which (2) holds; then e is called the
exponent of $\xi$,

$$\exp \xi = e. \qquad (2')$$

Since $(L - \lambda E)^r = (-1)^r \chi_L(L) = O$, $e \leq r$ for every vector $\xi$. If there
exists a vector $\eta$ in V such that

$$(L - \lambda E)^h \eta = \xi, \qquad (3)$$

then the same equation can be satisfied for any non-negative integral
exponent $j < h$, as $\eta' = (L - \lambda E)^{h - j} \eta$ implies $(L - \lambda E)^j \eta' = \xi$. Let
h be the highest non-negative integer for which (3) holds true; then h
is called the height of $\xi$,

$$\text{height}\ \xi = h. \qquad (3')$$

Obviously, height $\xi < r$ for $\xi \neq O$. Moreover

$$\exp c\xi \leq \exp \xi, \qquad \text{height}\ c\xi \geq \text{height}\ \xi \qquad (4)$$

for every element c of $\Lambda$, but inequality holds in (4) only for $c = 0$.

Let $\exp \xi_1 \leq k$, height $\zeta_1 \geq j$, $\exp \xi_2 \leq k$, height $\zeta_2 \geq j$; then for $i = 1, 2$

$$(L - \lambda E)^k \xi_i = O, \qquad (L - \lambda E)^j \eta_i = \zeta_i$$

for suitable vectors $\eta_i$, and therefore

$$(L - \lambda E)^k (\xi_1 \pm \xi_2) = O, \qquad (L - \lambda E)^j (\eta_1 \pm \eta_2) = \zeta_1 \pm \zeta_2. \qquad (5)$$

From (4) and (5) follows

Lemma 1. The vectors of V with an exponent $\leq k$ form a vectorspace
over $\Lambda$, say $W_k$; the vectors of V with a height $\geq j$ form a vectorspace
over $\Lambda$, say $W^j$.

Obviously,

$$W_g \subseteq W_k \ \text{for} \ g < k, \qquad W^i \subseteq W^j \ \text{for} \ i > j, \qquad W_r = W^0 = V.$$
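
The two notions can be made concrete by a short computation. The following Python sketch is an added illustration, not the author's method: the matrix `N` plays the part of $L - \lambda E$ for one assumed transformation of a space of rank 3, vectors are written as coordinate lists, and the names `exponent`, `height`, `rank` are hypothetical helpers. The exponent is found by applying `N` repeatedly; the height is found by testing whether the vector still lies in the image of the next power of `N` (a rank comparison).

```python
from fractions import Fraction

def mat_mul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_vec(m, v):
    return [sum(x * y for x, y in zip(row, v)) for row in m]

def rank(m):
    """Rank over the rationals, by elimination on the rows."""
    rows = [[Fraction(x) for x in row] for row in m]
    r = 0
    for c in range(len(rows[0])):
        pivot = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(len(rows)):
            if i != r and rows[i][c] != 0:
                f = rows[i][c] / rows[r][c]
                rows[i] = [x - f * y for x, y in zip(rows[i], rows[r])]
        r += 1
    return r

def exponent(n_mat, v):
    """Smallest e with N^e v = O (0 for the zero vector); N must be nilpotent."""
    e = 0
    while any(x != 0 for x in v):
        v = mat_vec(n_mat, v)
        e += 1
        if e > len(n_mat):
            raise ValueError("matrix is not nilpotent")
    return e

def height(n_mat, v):
    """Largest h such that v = N^h w is solvable; v must not be the zero vector."""
    if all(x == 0 for x in v):
        raise ValueError("the zero vector has no greatest height")
    n = len(n_mat)
    power = [[1 if i == j else 0 for j in range(n)] for i in range(n)]
    h = 0
    while h <= n:
        power = mat_mul(power, n_mat)                    # N^(h+1)
        augmented = [row + [x] for row, x in zip(power, v)]
        if rank(augmented) != rank(power):               # v is not in the image
            return h
        h += 1
    raise ValueError("matrix is not nilpotent")

# Assumed example: N maps the 2nd basis vector on the 1st and the 3rd on the 2nd.
N = [[0, 1, 0],
     [0, 0, 1],
     [0, 0, 0]]
basis = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
exponents = [exponent(N, b) for b in basis]
heights = [height(N, b) for b in basis]
print(exponents, heights)
```

In this instance the sum of exponent and height is 3 for each basis vector, and the vector `[1, 0, 1]` has the greatest exponent and the smallest height of its two components.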

Consider now any particular transformation, and suppose that $r' \leq r$ is
the greatest exponent which occurs, and h' the highest height of vectors
$\neq O$. Then there exists a particular vector $\eta_0$ such that

$$(L - \lambda E)^{r' - 1} \eta_0 = \xi_0 \neq O;$$

thus height $\xi_0 \geq r' - 1$, hence $h' \geq r' - 1$. Moreover there exists a vector
$\eta_1 \neq O$ such that

$$(L - \lambda E)^{h'} \eta_1 = \xi_1 \neq O;$$

thus $\exp \eta_1 \geq h' + 1$, hence $r' \geq h' + 1$. Hence

$$h' = r' - 1 \qquad (6)$$

Let $\xi \neq O$, $\exp \xi = e$, height $\xi = h$; then $(L - \lambda E)^{e - 1} \xi \neq O$,
$\exp (L - \lambda E)^{e - 1} \xi = 1$, and $h + e - 1 \leq$ height $(L - \lambda E)^{e - 1} \xi \leq h' = r' - 1$.
Hence

$$h + e \leq r' \qquad (7)$$

If in particular $e = r'$, then $h = 0$, and $\exp (L - \lambda E)^j \xi = r' - j$,
height $(L - \lambda E)^j \xi = j$. Hence every exponent between 1 and r' and
every height between 0 and $r' - 1$ actually occur. The above inequalities
can therefore be replaced by

$$W_g \subset W_k \ \text{for} \ 0 \leq g < k \leq r', \qquad W^i \subset W^j \ \text{for} \ 0 \leq j < i < r' \qquad (8)$$

Hence

$$W_1 \subset W_2 \subset \dots \subset W_{r'} = V,$$
$$W^{r' - 1} \subset W^{r' - 2} \subset \dots \subset W^0 = V.$$

Denote the meet

$$W_k \cap W^j = W^j_k, \qquad (9)$$

then $W^i_g \subseteq W^j_k$ for $j \leq i < r'$, $0 \leq g \leq k$. The subspaces $W_k$, $W^j$, $W^j_k$
are independent of any selection of a basis. The ranks of these spaces
are uniquely determined by the transformation L, and it will be shown
in 6-333 that they suffice to find out a ("canonical") representation of the
transformation by a matrix of a particular type; thus these numbers will
prove to be characteristic for L. To reach that result, the following
lemma will be used.

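Lemma 2. If $0 < \exp \xi_1 < \dots < \exp \xi_m$, then $\xi_1, \dots, \xi_m$ are
independent.

Proof. Let the vectors be dependent; then one can suppose without
loss of generality that $\xi_1, \dots, \xi_s$ are independent, whereas
$\xi_{s+1} = \sum_{i=1}^{s} c_i \xi_i$. Put $\exp \xi_{s+1} = t + 1$; then $(L - \lambda E)^t \xi_{s+1} \neq O$,
whereas $(L - \lambda E)^t \sum_{i=1}^{s} c_i \xi_i = O$, since $\exp \xi_i \leq t$ for $i \leq s$.
Hence the lemma.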

Exercises. (1) Establish a statement on vectors of different heights,
analogous to lemma 2.

(2) Consider the exponent of a vector of V as its "measure" and
find out the inequalities which hold for the sides of a (degenerate or
non-degenerate) triangle; establish the corresponding inequalities when
the vectors are measured by their heights.

(3) Investigate whether corresponding inequalities hold for other
mathematical entities and suitable operations, e.g. the ring of the
polynomials K[x], the ring of the numbers mod $p^r$, the system of curves which
pass through a particular point P and are differentiable in P to every order.


6-331 The case r' = 1. In this case, $W_1 = V$. Every vector $\neq O$
is of exponent 1 and of height 0. Let $\beta_1, \dots, \beta_r$ be any basis; then
$(L - \lambda E)\, \beta_i = O$, hence $L \beta_i = \lambda \beta_i$ for $i = 1, \dots, r$. Thus the
transformation is represented by a diagonal matrix

$$L = \begin{pmatrix} \lambda & & \\ & \ddots & \\ & & \lambda \end{pmatrix}$$

whatever basis may be used. In Geometry, this transformation is known
as a similarity of the r-dimensional space.

6-332 The case r' = r. Suppose there is such a case; then there
exists a vector, say $\beta_r$, of exponent r. Put $(L - \lambda E)^{r - k} \beta_r = \beta_k$ for
$k = 1, \dots, r$; then $\exp \beta_k = k$, and height $\beta_k \geq r - k$, but as the sum
of height and exponent cannot be greater than r (see 6-33 (7)), there is
height $\beta_k = r - k$. From 6-33, lemma 2 it follows that the r vectors

$$\beta_1, \dots, \beta_r \qquad (1)$$

are independent, and therefore they form a basis of V. Now $(L - \lambda E)\, \beta_k
= \beta_{k-1}$, and therefore $L \beta_k = \lambda \beta_k + \beta_{k-1}$. Thus by L the basis is
transformed as follows:

$$\begin{aligned}
\beta_1 &\to \lambda \beta_1 \\
\beta_2 &\to \beta_1 + \lambda \beta_2 \\
&\ \ \vdots \\
\beta_k &\to \beta_{k-1} + \lambda \beta_k \\
&\ \ \vdots \\
\beta_r &\to \beta_{r-1} + \lambda \beta_r
\end{aligned}$$

Hence it follows from 6-2, theorem 3 that L can be represented by

$$H_r = \begin{pmatrix} \lambda & 1 & & \\ & \lambda & \ddots & \\ & & \ddots & 1 \\ & & & \lambda \end{pmatrix} \qquad (2)$$

This formula shows that the case r' = r actually exists, and that for a given
number r all the transformations under consideration are similar. Thus
there is one canonical form for the matrices in this case, as there is one
canonical form for r' = 1. Similarly by $H^i_r$ a matrix of the type (2) with
the characteristic polynomial $(\lambda_i - x)^r$ will be denoted.

Let $\alpha = c_1 \beta_1 + \dots + c_r \beta_r \neq O$, and let k be the greatest index for
which $c_k$ is different from 0. Since $\exp \beta_s = s$ and height $\beta_s = r - s$
holds for $s = 1, \dots, r$, and since $(L - \lambda E)^h$ maps V on the vectorspace
generated by $\beta_1, \dots, \beta_{r - h}$, it follows that $\exp \alpha = k$, height $\alpha = r - k$.
Hence $\beta_1, \dots, \beta_k$ is a basis of $W_k$, and $\beta_1, \dots, \beta_{r - j}$ is a basis of $W^j$;
$W^j_k$ is generated by $\beta_1, \dots, \beta_g$, where g is the smaller one of the numbers
k and $r - j$. These subspaces have been defined in 6-33 in an invariant manner.

6-333. Characteristic polynomials with a single root: general case. Let
$r_1 \geq r_2 \geq \dots \geq r_p > 0$, and $\sum r_i = r$; then the matrix

$$L = \begin{pmatrix} H_{r_1} & & & \\ & H_{r_2} & & \\ & & \ddots & \\ & & & H_{r_p} \end{pmatrix} \qquad (1)$$

is of degree r, and its characteristic polynomial is $(\lambda - x)^r$. If $p = r$,
then every $r_i$ is equal to 1, and this is the case which was considered
already in 6-331, but if $p = 1$, it is the case of 6-332. It will be proved that (1)
is the most general case, i.e. that every matrix with characteristic
polynomial $(\lambda - x)^r$ is similar to one and only one matrix of the type (1). To
establish this theorem, it is convenient to construct beforehand the
invariant subspaces $W^j$, $W_k$, and to investigate their properties when
a particular transformation L, represented by a matrix of type (1), is given.
As it has been shown in 6-332, every $H_{r_i}$ can be represented by a basis

$$\beta^1_i, \beta^2_i, \dots, \beta^{r_i}_i, \quad \text{for } i = 1, \dots, p \qquad (2)$$

in such a way that

$$(L - \lambda E)\, \beta^k_i = \beta^{k-1}_i \ \text{for} \ k > 1, \qquad (L - \lambda E)\, \beta^1_i = O \qquad (3)$$

Hence

$$\exp \beta^k_i = k, \qquad \text{height}\ \beta^k_i = r_i - k \qquad (4)$$

The $r_1 + \dots + r_p = r$ vectors form a basis of V. Arrange these
vectors in the following manner:

$$\begin{array}{ll}
\text{1st row:} & \beta^1_1, \beta^1_2, \dots, \beta^1_p \\
\text{i-th column:} & \beta^1_i, \dots, \beta^{r_i}_i \ \text{(vertical)}
\end{array} \qquad (5)$$

Let e.g. $p = 7$, $r_1 = r_2 = 5$, $r_3 = r_4 = r_5 = 3$, $r_6 = 2$, $r_7 = 1$;
then the scheme is

$$\begin{array}{ccccccc}
\beta^1_1 & \beta^1_2 & \beta^1_3 & \beta^1_4 & \beta^1_5 & \beta^1_6 & \beta^1_7 \\
\beta^2_1 & \beta^2_2 & \beta^2_3 & \beta^2_4 & \beta^2_5 & \beta^2_6 & \\
\beta^3_1 & \beta^3_2 & \beta^3_3 & \beta^3_4 & \beta^3_5 & & \\
\beta^4_1 & \beta^4_2 & & & & & \\
\beta^5_1 & \beta^5_2 & & & & &
\end{array} \qquad (5')$$

The index of a row shows the exponent, whereas the height is given by
the number of $\beta$'s standing in the same column below the considered vector.
The first k rows form a basis of the subspace $W_k$.
Thus

rank $W_1$ = number of $\beta$'s in the 1st row,

rank $W_k$ − rank $W_{k-1}$ = number of $\beta$'s in the k-th row.

Hence the shape of the triangular scheme (5) is completely determined by
the ranks of the spaces $W_k$. It is therefore impossible that a transformation
is representable by two different matrices of the type (1). Moreover,
if (5) is a basis of V, and the conditions (3) are satisfied, every column of (5)
is a basis of a subspace $V_i$ of rank $r_i$ which is mapped on itself by L, the
transformation of $V_i$ being represented by $H_{r_i}$, and therefore the
transformation of V by (1). After these remarks it is easy to prove the
following theorem.

Theorem. If $\chi_L(x) = (\lambda - x)^r$, then L is similar to one and only one
matrix of the type (1).
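
The counting argument, by which the ranks of the spaces $W_k$ fix the shape of the scheme (5), can be sketched numerically. The Python code below is an added illustration under stated assumptions: it builds $L - \lambda E$ for an assumed matrix of type (1) with blocks of degrees 5, 5, 3, 3, 3, 2, 1 (so $p = 7$, $r = 22$), computes rank $W_k$ as $r$ minus the rank of $(L - \lambda E)^k$, and reads the degrees of the blocks off as the column lengths of the scheme. All function names are hypothetical.

```python
from fractions import Fraction

def mat_mul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def rank(m):
    """Rank over the rationals, by elimination on the rows."""
    rows = [[Fraction(x) for x in row] for row in m]
    r = 0
    for c in range(len(rows[0])):
        pivot = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(len(rows)):
            if i != r and rows[i][c] != 0:
                f = rows[i][c] / rows[r][c]
                rows[i] = [x - f * y for x, y in zip(rows[i], rows[r])]
        r += 1
    return r

def nilpotent_from_blocks(sizes):
    """L - lambda*E for a diagonal system of blocks of the given degrees."""
    r = sum(sizes)
    n_mat = [[0] * r for _ in range(r)]
    start = 0
    for size in sizes:
        for k in range(1, size):
            n_mat[start + k - 1][start + k] = 1   # next basis vector -> previous one
        start += size
    return n_mat

def row_lengths(n_mat):
    """Number of beta's in the k-th row: rank W_k - rank W_(k-1)."""
    r = len(n_mat)
    power = [[1 if i == j else 0 for j in range(r)] for i in range(r)]
    lengths, previous = [], 0
    for _ in range(r):
        power = mat_mul(power, n_mat)
        kernel_rank = r - rank(power)             # rank W_k
        if kernel_rank == previous:               # no new exponent occurs
            break
        lengths.append(kernel_rank - previous)
        previous = kernel_rank
    return lengths

def column_lengths(rows):
    """Degrees r_1 >= r_2 >= ... of the blocks, read off column by column."""
    if not rows:
        return []
    return [sum(1 for length in rows if length > i) for i in range(rows[0])]

sizes = [5, 5, 3, 3, 3, 2, 1]
rows = row_lengths(nilpotent_from_blocks(sizes))
recovered = column_lengths(rows)
print(rows, recovered)
```

The row lengths depend only on ranks, which are the same for similar matrices; this is the reason why two different matrices of type (1) cannot represent the same transformation.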

Proof. It suffices to prove that there exists a basis (5) which satisfies
the conditions (3). Let rank $W_1 = p$, and $r' \leq r$ be the highest exponent
occurring in V. Consider the sequence of subspaces $W^{r'-1}_1 \subseteq \dots \subseteq W^1_1 \subseteq W^0_1
= W_1$; omit in this sequence all those subspaces which are identical with
preceding ones, so getting

$$W^{r'-1}_1 \subset W^{s-1}_1 \subset \dots \subset W^{t-1}_1 = W_1, \qquad (6)$$

where $r' > s > \dots > t \geq 1$. Construct now a basis
$\beta^1_1, \dots, \beta^1_a$ of $W^{r'-1}_1$, supplement it to a basis
$\beta^1_1, \dots, \beta^1_a, \beta^1_{a+1}, \dots, \beta^1_b$ of $W^{s-1}_1$, and continue this procedure
until it comes out finally as a basis

$$\beta^1_1, \dots, \beta^1_a, \dots, \beta^1_b, \dots, \beta^1_p \qquad (7)$$

of $W^{t-1}_1 = W_1$ which has the property that to each $W^k_1$ there corresponds
an initial section of (7) which is a basis of $W^k_1$. Since the vectors $\beta^1_1, \dots, \beta^1_a$
are of height $r' - 1$, there exist vectors $\beta^{r'}_\mu$ such that, for $\mu = 1, \dots, a$,
$(L - \lambda E)^{r'-1} \beta^{r'}_\mu = \beta^1_\mu$. Put $(L - \lambda E)^{r'-k} \beta^{r'}_\mu = \beta^k_\mu$, and arrange
these vectors into columns as in (5). The row-index k is equal to the
exponent of the vector, and the height is equal to $r' - k$, which is equal to the
number of vectors standing in the same column below it. Correspondingly,
there exist, for $\nu = a + 1, \dots, b$, vectors $\beta^s_\nu$ for which $(L - \lambda E)^{s-1} \beta^s_\nu
= \beta^1_\nu$; again put $(L - \lambda E)^{s-k} \beta^s_\nu = \beta^k_\nu$, arrange these vectors in columns
as in (5) and continue this procedure up to the end of (7). The triangular
scheme obtained in this way satisfies the conditions (3); moreover every
row-index shows the exponent, and the height is equal to the number of
$\beta$'s standing vertically below. It remains to show that all these $\beta$'s form a
basis of V. If any linear equation between the $\beta$'s holds, let k be the highest
row-index occurring in it with a coefficient different from 0. Multiplication
from the left side with $(L - \lambda E)^{k-1}$ maps the $\beta$'s of the lower rows on O,
and from the part $\sum_j c_j \beta^k_j$ of the equation which belongs to the k-th row
it follows that $\sum_j c_j \beta^1_j = O$. Now the vectors in the first row form a basis
of $W_1$ and are therefore independent; hence every $c_j$ is 0, contrary to the
choice of k. Thus the $\beta$'s are independent. To show
that they generate V, it will be proved by mathematical induction that for
$k = 1, \dots, r'$, the k first rows generate $W_k$. This statement is true for
$k = 1$; suppose it to be true for $k = q - 1$ and let $\exp \xi = q$. For
$\xi' = (L - \lambda E)\, \xi$, $\exp \xi' = q - 1$; hence $\xi' = \sum_{i < q} c_{ij} \beta^i_j$. For the same
coefficients $c_{ij}$, put $\sum_{i < q} c_{ij} \beta^{i+1}_j = \eta$; then $(L - \lambda E)(\xi - \eta) = O$, and
therefore $\exp (\xi - \eta) \leq 1$. Hence $\xi - \eta$ belongs to $W_1$, and therefore
$\xi - \eta = \sum_k d_k \beta^1_k$; thus $\xi$ depends on the $\beta$'s in the q first rows.
Hence the theorem.


6-34 Characteristic polynomials with any number of roots. The theorems
of 6-32 and 6-33 furnish directly the following general result.

Main theorem on similarity of matrices. Every matrix A of $R(\Lambda, n)$
with the characteristic polynomial $\chi_A(x) = (\lambda_1 - x)^{r_1} \cdots (\lambda_m - x)^{r_m}$
is similar to one and only one matrix, up to a permutation of the $L_j$,
(canonical form)

$$L = \begin{pmatrix} L_1 & & \\ & \ddots & \\ & & L_m \end{pmatrix}, \qquad L_j = \begin{pmatrix} H^j_1 & & \\ & \ddots & \\ & & H^j_{p_j} \end{pmatrix},$$

$L_j$ being of degree $r_j$, $H^j_k$ of order $r_{j,k}$; moreover $\sum_k r_{j,k} = r_j$, and

$$r_{j,1} \geq r_{j,2} \geq \dots \geq r_{j,p_j}.$$

6-341. Application to the theory of the linear substitutions of a complex
variable. Consider the complex linear substitutions of the type

$$w = \frac{\alpha z + \beta}{\gamma z + \delta}, \quad \text{where } \alpha\delta - \beta\gamma \neq 0.$$

Introduce homogeneous coordinates $w = w_1 : w_2$, $z = z_1 : z_2$; then

$$w_1 = \alpha z_1 + \beta z_2, \qquad w_2 = \gamma z_1 + \delta z_2.$$

As a common complex factor of $\alpha, \beta, \gamma, \delta$ is arbitrary, one can arrange
that $\alpha\delta - \beta\gamma = 1$; now only a common factor $\pm 1$ is arbitrary.

Let

$$\begin{pmatrix} \alpha & \beta \\ \gamma & \delta \end{pmatrix} = A, \qquad \chi_A(x) = x^2 - \kappa x + 1 = (\lambda_1 - x)(\lambda_2 - x), \quad \kappa = \alpha + \delta.$$

Hence $\lambda_1 \lambda_2 = 1$, thus $\lambda_1 = r e^{i\varphi}$, $\lambda_2 = r^{-1} e^{-i\varphi}$.
Two cases must be distinguished.

(1) $\lambda_1 \neq \lambda_2$; then

$$A \sim \begin{pmatrix} r e^{i\varphi} & 0 \\ 0 & r^{-1} e^{-i\varphi} \end{pmatrix} \qquad (1)$$

As a factor $\pm 1$ and a permutation of $\lambda_1, \lambda_2$ are arbitrary, choose $1 \leq r$,
$0 \leq \varphi < \pi$.

(2) $\lambda_1 = \lambda_2 = \pm 1$, $\kappa = \pm 2$; as a common factor $\pm 1$ is arbitrary,
suppose without loss of generality that $\lambda_1 = \lambda_2 = 1$, $\kappa = 2$. Thus there
are two normal-forms

$$\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}, \qquad \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix} \qquad (2)$$

These transformations are largely discussed in the elements of the theory
of functions. The classes of transformations with the normal-form (1)
are said to be loxodromic; they have two fixed points, which for the
normal-form are chosen as 0 and $\infty$. In the particular case where $r = 1$, the
transformations are called elliptic, and if $r > 1$, $\varphi = 0$, hyperbolic. The
first matrix (2) denotes the identity; the second matrix denotes a
parallel-displacement, the infinite point of the complex sphere being the only fixed
point, and the other transformations of this class are called parabolic
transformations, the only fixpoint being a finite point.
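
The distinction of cases can be sketched as a small classification routine. The Python code below is an added illustration; the function name `classify`, the tolerance `tol` and the use of floating-point complex numbers are assumptions of this sketch, not part of the text. After dividing the coefficients by a square root of $\alpha\delta - \beta\gamma$, it uses only the number $\kappa = \alpha + \delta$: with $\lambda_1 = r e^{i\varphi}$, $\lambda_2 = r^{-1} e^{-i\varphi}$ one has $\kappa = 2\cos\varphi$ for $r = 1$, and $\kappa = r + r^{-1} > 2$ for $\varphi = 0$, $r > 1$. The label "loxodromic" is used here only for the remaining transformations of normal-form (1), which are neither elliptic nor hyperbolic.

```python
import cmath

def classify(alpha, beta, gamma, delta, tol=1e-9):
    """Classify w = (alpha*z + beta) / (gamma*z + delta) by kappa = alpha + delta.

    Floating-point sketch: equalities are tested up to the tolerance tol.
    """
    det = alpha * delta - beta * gamma
    if abs(det) < tol:
        raise ValueError("alpha*delta - beta*gamma must be different from 0")
    s = cmath.sqrt(det)                      # normalise to determinant 1
    alpha, beta, gamma, delta = (x / s for x in (alpha, beta, gamma, delta))
    if abs(beta) < tol and abs(gamma) < tol and abs(alpha - delta) < tol:
        return "identity"
    kappa = alpha + delta                    # determined up to the factor +1 or -1
    if abs(kappa.imag) > tol:
        return "loxodromic"
    k = abs(kappa.real)
    if abs(k - 2) < tol:
        return "parabolic"
    return "elliptic" if k < 2 else "hyperbolic"

examples = {
    "z + 1": classify(1, 1, 0, 1),
    "2 z": classify(2, 0, 0, 1),
    "i z": classify(1j, 0, 0, 1),
    "2 i z": classify(2j, 0, 0, 1),
    "z": classify(3, 0, 0, 3),
}
print(examples)
```

The sign ambiguity of the normalisation does not matter, since only the absolute value of the real number $\kappa$, or the fact that $\kappa$ is not real, is used.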
6-35 Polynomials of which A is a root. Let f(x) be a polynomial of
$\Lambda[x]$; then f(A) is a matrix, and f(A) = O if and only if this matrix maps V
on O, that means if $f(A)\, \xi = O$ for every vector $\xi$ of V. If f(x) runs
over all the polynomials of $\Lambda[x]$, the polynomials f(A) form a commutative
ring $R(\Lambda, A)$ (see 6-1); hence $f_2(A) = O$ implies $f_1(A) f_2(A) = O$, and if A
is a root of $f_1(x)$ and $f_2(x)$, it is also a root of $k_1(x) f_1(x) + k_2(x) f_2(x)$, hence it is a
root of the highest common factor $(f_1(x), f_2(x))$. In every matrix-equation,
the matrix can be replaced by a similar one (see the theorem of 6-13); thus
one may replace the matrix by the transformation which it represents, or
conversely. Apply now the notations of (6-33).

Every vector $\xi$ of V can be represented by

$$\xi = \xi_1 + \dots + \xi_m \qquad (1)$$

where $\xi_i$ belongs to $V_i$. Since the vectorspaces $V_i$ are invariant for $\mathfrak{A}$,
the transformation $\mathfrak{A}$ of the vectors $\xi_i$ is represented by $L_i$. Let now
$r^i_1, r^i_2, \dots, r^i_{p_i}$ be the degrees of the matrices $H^i_1, \dots, H^i_{p_i}$ respectively; then

$$(A - \lambda_i E)^{r^i_1} \xi_i = O, \quad \text{for } i = 1, \dots, m, \qquad (2)$$

but for every $t < r^i_1$, there exist vectors $\eta_i$ of $V_i$ such that

$$(A - \lambda_i E)^t \eta_i \neq O \qquad (2')$$

Put $r^i_{p_i + 1} = r^i_{p_i + 2} = \dots = 0$, and

$$\varphi_k(x) = \prod_{i=1}^{m} (x - \lambda_i)^{r^i_k} \quad \text{for } k = 1, \dots, n \qquad (3)$$

Since $r^i_1 \geq r^i_2 \geq \dots \geq r^i_{p_i} \geq 0 = r^i_{p_i + 1}$ holds for every i, in the sequence

$$\varphi_1(x), \varphi_2(x), \dots, \varphi_n(x), \qquad (4)$$

every polynomial is divisible by the following one; except in the case when
the polynomials (4) are all equal, $\varphi_n(x)$ is the polynomial 1. Moreover

$$\chi_A(x) = (-1)^n \prod_{k=1}^{n} \varphi_k(x) \qquad (5)$$

Since from (2) and (3) it follows that $\varphi_1(A)\, \xi_i = O$ for every i, formula (1)
shows that $\varphi_1(A)\, \xi = O$ for every $\xi$ of V. Hence

$$\varphi_1(A) = O. \qquad (6)$$

Suppose now that F(x) is a polynomial of $\Lambda[x]$, and let F(A) = O. Form
the h.c.f., say

$$(F(x), (x - \lambda_i)^{r^i_1}) = k_1(x) F(x) + k_2(x) (x - \lambda_i)^{r^i_1} = (x - \lambda_i)^t;$$

then

$$\left( k_1(A) F(A) + k_2(A) (A - \lambda_i E)^{r^i_1} \right) \eta_i = (A - \lambda_i E)^t \eta_i.$$

The left side is equal to O for every vector $\eta_i$ of $V_i$, but from (2') it follows
that there exist vectors in $V_i$ such that the right side is different from O when
$t < r^i_1$. Hence F(x) is divisible by $(x - \lambda_i)^{r^i_1}$ and, since this holds for every i,
divisible by $\varphi_1(x)$. From this statement and from (6) follows

Theorem. F(A) = O, if and only if F(x) is divisible in $\Lambda[x]$ by $\varphi_1(x)$.

If the factorisation (5) of $\chi_A(x)$ into factors $\varphi_k(x)$ (of which in general
some are reducible, and others are 1) is given, the roots $\lambda_i$ and the
exponents $r^i_k$ are determined, and these again determine uniquely the
canonical form of the matrix. Hence the right side of (5) characterises uniquely
the corresponding class of similar matrices (transformations). In the
particular case where the roots of $\chi_A(x)$ are all different, $\chi_A(x) = (-1)^n \varphi_1(x)$;
thus in this case the class of matrices is uniquely determined by the
characteristic polynomial.
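
The exponents $r^i_1$ of the polynomial $\varphi_1(x)$ can be found from ranks alone, without constructing the canonical form. The following Python sketch is an added illustration under stated assumptions: the matrix `A` is an assumed example with $\chi_A(x) = (2 - x)^3 (3 - x)$, its roots are taken as known, and the helper names are hypothetical. For each root it finds the smallest t for which the ranks of $(A - \lambda_i E)^t$ and $(A - \lambda_i E)^{t+1}$ agree, and then checks that the product of the factors $(A - \lambda_i E)^{r^i_1}$ is zero, while lowering an exponent gives a matrix different from zero.

```python
from fractions import Fraction

def mat_mul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def identity(n):
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]

def shifted(a, lam):
    """Return A - lam*E."""
    n = len(a)
    return [[a[i][j] - (lam if i == j else 0) for j in range(n)] for i in range(n)]

def rank(m):
    """Rank over the rationals, by elimination on the rows."""
    rows = [[Fraction(x) for x in row] for row in m]
    r = 0
    for c in range(len(rows[0])):
        pivot = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(len(rows)):
            if i != r and rows[i][c] != 0:
                f = rows[i][c] / rows[r][c]
                rows[i] = [x - f * y for x, y in zip(rows[i], rows[r])]
        r += 1
    return r

def largest_exponent(a, lam):
    """Smallest t with rank (A - lam E)^t = rank (A - lam E)^(t+1)."""
    n_mat, power, t = shifted(a, lam), identity(len(a)), 0
    while True:
        nxt = mat_mul(power, n_mat)
        if rank(nxt) == rank(power):
            return t
        power, t = nxt, t + 1

def evaluate(a, exponents):
    """Product of (A - lam E)^e over the pairs lam -> e in exponents."""
    result = identity(len(a))
    for lam, e in exponents.items():
        for _ in range(e):
            result = mat_mul(result, shifted(a, lam))
    return result

A = [[2, 1, 0, 0],
     [0, 2, 0, 0],
     [0, 0, 2, 0],
     [0, 0, 0, 3]]        # assumed example: chi_A(x) = (2 - x)^3 (3 - x)
phi_1 = {lam: largest_exponent(A, lam) for lam in (2, 3)}
zero = [[0] * 4 for _ in range(4)]
print(phi_1, evaluate(A, phi_1) == zero)
```

In this instance $\varphi_1(x) = (x - 2)^2 (x - 3)$ is of lower degree than $\chi_A(x)$, which shows that the characteristic polynomial alone does not determine the class of similar matrices when roots coincide.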

Let $K \subset \Lambda$, and suppose that $\chi_A(x)$ belongs to K[x]; then $\varphi_1(x)$
may not belong to K[x], as it is seen by the example

[example matrix not legible in the source]

That however $\varphi_1(x)$ belongs to K[x] when A is a matrix over K will be
shown in 6-44.

6-4 Elementary divisors 

In 6-2 and 6-3 matrices over a field have been investigated. This section
deals with matrices over a Euclidean domain $\Delta$; thus the theory to be
developed here can be applied e.g. to matrices whose elements are integral
numbers as well as to the case when the elements are polynomials in x over
a field. The difference between matrices over a field K and matrices over $\Delta$
appears from the following comparison.

If A runs over all the matrices of R(K, n), and $\xi \neq O$ is a vector of a
vectorspace V over K, then $A\xi$ runs over all the vectors of V; moreover
if B is a particular matrix of R(K, n), then it maps V either on itself (namely
if det B $\neq$ 0) or on a subspace of a lower rank. On the other hand, let $\Delta$
be a Euclidean domain which is not a field, and $\Delta \subset K$. If A runs over
$R(\Delta, n)$, then $A\xi$ runs over the vectors of a modul M of rank n over $\Delta$,
but M is not a vectorspace; moreover a matrix B of $R(\Delta, n)$ may map M
on a submodul which is of the same rank n but not identical with M.

This comparison suffices to show that the transformations to be
considered now are of a quite different character to those treated earlier.
Correspondingly a different equivalence of matrices will be used here. In place
of classes of similar matrices, classes of congruent matrices will be
investigated; it must however be emphasised that, unlike in Geometry,
congruence is not a special case of similarity. It is better not to think of the
geometrical significance of these two words when using them for matrices.
Congruence of matrices can be defined either by the help of operations on
rows and columns, or by matrix multiplication; both definitions lead to
the same result. At first the operations on rows and columns will be used.
(See 1-4 and 2-44)

6-41 Congruent matrices. Let A be a matrix of the ring $R(\Delta, n)$,
and let $\alpha_1, \dots, \alpha_n$ be its column-vectors. If c is an element of $\Delta$ and $i \neq k$,
then the transformation

$$\alpha_i \to c\, \alpha_k + \alpha_i \qquad (1)$$

is called a column-addition, whereas $\alpha_i \to c\, \alpha_i$ is a column-multiplication
with c; correspondingly the terms row-addition and row-multiplication
are used. These definitions tally to a certain extent with those of 1-4.
In Chapter I, c was supposed to be a number; in (2-61) the generalisation
from "number" to "element of a field K" was performed, but here, c is
bound to be an element of a Euclidean domain $\Delta$. If A is transformed
into A' by a row- (or a column-) addition, then A' is transformed into A by a
transformation of the same type, c being replaced by $-c$. If however the
transformation $A \to A'$ is done by row- (column-) multiplication, the inverse
operation is the division of a row (column) by an element c of $\Delta$, and this
operation is a row- (column-) multiplication if and only if c is a unity of $\Delta$.
A transformation which is composed of zero or more row-additions,
column-additions, row-multiplications with unities and column-multiplications
with unities, is called a congruence. Thus if A is transformed by a
congruence into A', the inverse transformation is also a congruence; i.e.
congruence satisfies the law of symmetry. That congruence satisfies also the
laws of reflexivity and transitivity is obvious. Hence congruence is an
equivalence, and one can form classes of congruent matrices in $R(\Delta, n)$.
The congruence of two matrices A and A' is denoted by

$$A \equiv A' \qquad (2)$$

To find out the invariants for classes of congruent matrices, it suffices
to determine those properties of a matrix which are invariant for row- (and
column-) addition as well as for row- (and column-) multiplications with unities.
The column-addition (1) transforms

$$a_{ji} \to a_{ji} + c\, a_{jk} \quad \text{for } j = 1, \dots, n,$$

whereas the other elements of A remain unaltered. If therefore d is a common
factor of all the elements of A, then it is also a common factor of the elements
of the transformed matrix A'; the corresponding holds for row-addition
and for row- and column-multiplication. Hence if A' is congruent to A,
every element of A' is divisible by the h.c.f., say $\delta_1$, of the elements of A,
but as in this case A is also congruent to A', the matrices have the same
h.c.f. (which is determined up to a unity factor only).

This consideration can be generalised to the h.c.f., say $\delta_q$, of the
minors of degree q. Indeed, the elements of A are the minors of degree 1.
Let M be a matrix formed by q rows and q columns of A. If the i-th
column does not contain any element of M, then M is not altered by
the column-addition (1); if both the i-th and the k-th columns have elements
in common with M, then M is altered but det M is unchanged; if the
i-th column, but not the k-th column has elements in common with M, and M'
is the matrix obtained from M by replacing the i-th column by the k-th column,
then det M is transformed into det M + c det M'. Thus a common factor
of the values det M, when det M runs over all the minors of degree q, remains
a common factor of them after any column-addition; the corresponding
invariance holds for row-additions. By row- (column-) multiplication, the
minors take a unity factor only. Hence, if $A \equiv A'$, the highest common
factor $\delta_q$ of the minors of degree q of A is a common factor of the minors
of degree q of A', but since also the converse holds, $\delta_q$ (which is determined
up to a unity factor only) remains invariant. This holds for $q = 1, \dots, n$,
and $\delta_n = \det A$.

Let $b_1, \dots, b_q$ be the elements of any row of M, and $B_1, \dots, B_q$ their
cofactors. Since

$$\det M = b_1 B_1 + \dots + b_q B_q$$

holds, and the B's are divisible by $\delta_{q-1}$, so det M is divisible by $\delta_{q-1}$.
This holds for every minor det M of degree q; hence $\delta_q$ is divisible by $\delta_{q-1}$.
The elements

$$\delta_1, \delta_2, \dots, \delta_n \qquad (3)$$

are called the determinant divisors of A. Using this notation, the results
obtained here read as follows.

Theorem. Congruent matrices have the same system (3) of determinant
divisors. For $i < k$, $\delta_i$ is a factor of $\delta_k$.

This theorem applies also to the case when the Euclidean domain is
a field, but then every element is either a unity or zero. Without loss of
generality, one can assume that the determinant divisors are 1 or 0; if
any one of the elements of the sequence (3) is 0, then the following elements
are also equal to 0. The number of the elements equal to 1 is equal to
rank (A). Thus the sequence (3) appears to be a generalisation of the
notion of rank.
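
For matrices whose elements are integral numbers, the determinant divisors can be computed directly from their definition. The Python sketch below is an added illustration: the matrix `A` is an assumed example, the function names are hypothetical, and the unity factor is fixed by taking every h.c.f. non-negative. It forms the h.c.f. of all minors of degree q and checks that a column-addition (1) leaves the system (3) unchanged.

```python
from itertools import combinations
from math import gcd

def det(m):
    """Determinant by expansion along the first row (adequate for small matrices)."""
    if len(m) == 1:
        return m[0][0]
    total = 0
    for j, entry in enumerate(m[0]):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * entry * det(minor)
    return total

def determinant_divisors(a):
    """Return [delta_1, ..., delta_n]: the h.c.f. of the minors of each degree q."""
    n = len(a)
    divisors = []
    for q in range(1, n + 1):
        g = 0
        for rows in combinations(range(n), q):
            for cols in combinations(range(n), q):
                g = gcd(g, det([[a[i][j] for j in cols] for i in rows]))
        divisors.append(g)
    return divisors

def column_addition(a, i, k, c):
    """Return the matrix obtained by alpha_i -> c*alpha_k + alpha_i (i != k)."""
    return [[row[j] + (c * row[k] if j == i else 0) for j in range(len(row))]
            for row in a]

A = [[2, 4, 4],
     [-6, 6, 12],
     [10, -4, -16]]
A_prime = column_addition(A, 0, 2, 5)      # alpha_1 -> 5*alpha_3 + alpha_1

print(determinant_divisors(A), determinant_divisors(A_prime))
```

The enumeration of all minors grows quickly with n; the sweep-out method of 6-42 avoids it by first bringing the matrix into diagonal form.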

6-42 "Sweep out" for matrices of $R(\Delta, n)$. The determination of
the determinant divisors is much simplified when the matrix is a
diagonal-matrix. It has been shown in (1-4) how a matrix can be transformed by
row-additions and permutations of columns into a diagonal-matrix, but
in those investigations, the matrices were supposed to be matrices over a
field, and the division by a matrix-element is indeed an important step
when a matrix of that kind is swept out. To be applied to matrices of
$R(\Delta, n)$, the method has to be modified; the operation of division will be
replaced by the algorithm of the h.c.f.; on the other hand one is allowed
to make full use of the addition of columns. Whereas the considerations
of 6-41 hold for every integral domain with unique factorisation, it is
essential for the present section that $\Delta$ is a Euclidean domain.

Suppose $a^k_1 = a$ is the h.c.f. of the elements of the k-th row of a matrix
A, say $a^k_i = a\,a'_i$ for i = 2, ..., n; then one can "sweep out" the row by
the n − 1 column-additions $a_i \to a_i - a'_i a_1$; thereafter the k-th row will
be (a, 0, ..., 0). If $a^k_1$ is not the h.c.f. of the k-th row, but $a^k_i = a$ is,
then $a^k_1 = a(a' + 1)$; by $a_1 \to a_1 - a' a_i$ the element a is brought to the
first place in the row, and one can sweep out the row. The corresponding
holds for columns. Thus if any element $a^\mu_\nu$ of the matrix is equal to the
invariant highest common factor $\delta_1$, one can sweep out successively the
$\mu$-th row, the first column, and then the first row without changing the first
column. Thus the matrix is transformed by row- and column-additions into

$$\begin{pmatrix} \delta_1 & 0 \\ 0 & A' \end{pmatrix} \tag{1}$$
Suppose now that no element of A is equal to $\delta_1$ (up to a unity factor).
Since $\Delta$ is a Euclidean domain, a norm-function (see 2-4) for every
element exists, and from the supposition it follows that $N(a^\mu_\nu) > N(\delta_1)$. Let
$a^i_k = a$ be an element of the matrix with the smallest occurring norm;
it will be shown now that one can arrange by row- and column-additions
that an element with a norm < N(a) appears. Suppose that a is not
the h.c.f. of the i-th row, say $a^i_j$ is not divisible by a; then there exists
an element

$$b = a^i_j - ca \tag{2}$$

such that N(b) < N(a). The column-addition $a_j \to a_j - c\,a_k$ transforms
$a^i_j$ into b; correspondingly if there is an element in the k-th column which
is non-divisible by a. If however every element of the i-th row is divisible
by a, then sweep out the row; if thereafter in the first column there is an
element which is non-divisible by a, one can apply the method given just
before to replace this element by another with a norm < N(a). If the
elements of the first column are all divisible by a, then there must be
elsewhere such an element, since a is not an h.c.f.; let $a^\mu_\nu$ be such an element,
then $\mu \neq i$, $\nu \neq 1$; now $a_1 \to a_1 + a_\nu$ leaves $a^i_1 = a$ unchanged and
transforms $a^\mu_1 \to a^\mu_1 + a^\mu_\nu$, which is not divisible by a (since the first
term is divisible and the second is not). Thus it is always possible to bring
the element a in the same row or column with an element non-divisible by a
and then to replace the latter one by an element b for which N(b) < N(a).
This method can be repeated until (after at most $N(a) - N(\delta_1)$ steps)
an element appears the norm of which is $N(\delta_1)$, and which is therefore an
h.c.f. of the elements of A. By sweeping out, one gets hereafter A
transformed into (1). Hence

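Lemma. Every matrix A of $R(\Delta, n)$ can be transformed by row- and
column-additions into

$$\begin{pmatrix} \varepsilon_1 & 0 \\ 0 & A' \end{pmatrix} \tag{1'}$$

where $\varepsilon_1 = \delta_1$.

Now A' can be transformed correspondingly by row- and column-additions
amongst the 2nd to n-th rows (columns). By these operations the first row
and the first column are unaltered. Thus one gets

$$\begin{pmatrix} \varepsilon_1 & & \\ & \varepsilon_2 & \\ & & A'' \end{pmatrix}$$

where $\varepsilon_2$ is divisible by $\varepsilon_1$, and on the other hand $\varepsilon_2$ is an h.c.f. of the
elements of A'. Thus after n steps one gets

$$\begin{pmatrix} \varepsilon_1 & & \\ & \ddots & \\ & & \varepsilon_n \end{pmatrix} \tag{2}$$

where $\varepsilon_{i+1}$ is divisible by $\varepsilon_i$.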

Theorem. Every matrix A is congruent to a diagonal-matrix (2) where
$\varepsilon_{i+1}$ is divisible by $\varepsilon_i$ for i = 1, ..., n − 1, and the congruence can be
generated by row- and column-additions. The elements $\varepsilon_i$ are uniquely
determined up to unity factors, and

$$\delta_q = \varepsilon_1 \varepsilon_2 \cdots \varepsilon_q \quad \text{for } q = 1, \ldots, n. \tag{3}$$

Proof. The first statement of the theorem has been proved already.
The minors of any degree q of the diagonal-matrix (2) are either equal to 0
or equal to products of diagonal elements $\pm\,\varepsilon_{1+t_1} \varepsilon_{2+t_2} \cdots \varepsilon_{q+t_q}$, where the
t's are non-negative. Now $\varepsilon_{i+t_i}$ is divisible by $\varepsilon_i$. Hence each of these
determinants is divisible by $\varepsilon_1 \cdots \varepsilon_q$, and since the minor which is situated
in the left upper corner is equal to that product, formula (3) holds. A unity
factor remains arbitrary for each $\delta_q$; from (3) follows that

$$\varepsilon_q = \delta_q / \delta_{q-1}. \tag{3'}$$

Thus the $\varepsilon$'s are determined up to a unity factor. This factor can be
chosen arbitrarily, because a multiplication of the rows of (2) with unity
elements is a congruence. Hence the theorem.

The elements $\varepsilon_1, \ldots, \varepsilon_n$ are called the elementary divisors* of A,
and (2) is the canonical form of A.


* The notation is not uniform , some authors give this name to the factors e t • e t ., 
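The reduction described in 6-42 can be followed step by step for matrices over the integers, where the norm is the absolute value. The following Python sketch is an illustration under stated assumptions and is not the author's procedure in every detail: the name `elementary_divisors` is chosen here, the matrix is assumed to be square with integral elements, and interchanges of rows and columns are used for brevity (they alter a determinant at most by the unity factor −1).

```python
def elementary_divisors(a):
    """Return the diagonal of the canonical form of a square integer matrix."""
    m = [row[:] for row in a]          # work on a copy
    n = len(m)
    for t in range(n):
        while True:
            # non-zero elements of the block still to be treated
            block = [(abs(m[i][j]), i, j)
                     for i in range(t, n) for j in range(t, n) if m[i][j] != 0]
            if not block:
                break                  # the remaining block is zero
            _, i0, j0 = min(block)     # element with the smallest norm
            m[t], m[i0] = m[i0], m[t]  # bring it to the place (t, t)
            for row in m:
                row[t], row[j0] = row[j0], row[t]
            a0 = m[t][t]
            # row-additions: elements below a0 are replaced by remainders
            for i in range(t + 1, n):
                c = m[i][t] // a0
                m[i] = [x - c * y for x, y in zip(m[i], m[t])]
            # column-additions: elements to the right of a0 likewise
            for j in range(t + 1, n):
                c = m[t][j] // a0
                for row in m:
                    row[j] -= c * row[t]
            if any(m[i][t] for i in range(t + 1, n)) or any(m[t][t + 1:]):
                continue               # an element of smaller norm has appeared
            bad = [i for i in range(t + 1, n)
                   if any(x % a0 for x in m[i][t + 1:])]
            if not bad:
                break                  # a0 divides every remaining element
            # bring a non-divisible element into the row of a0
            m[t] = [x + y for x, y in zip(m[t], m[bad[0]])]
        if m[t][t] < 0:                # multiplication of a row by the unity -1
            m[t] = [-x for x in m[t]]
    return [m[i][i] for i in range(n)]


print(elementary_divisors([[2, 0], [0, 3]]))
print(elementary_divisors([[2, 4, 4], [-6, 6, 12], [10, -4, -16]]))
```

For the second matrix the determinant divisors are 2, 12 and 144, so that the quotients (3') are 2, 6 and 12, each a factor of the following one.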

Corollary. If in a diagonal-matrix D of $R(\Delta, n)$ every element is
divisible by the preceding one, then D is a canonical form.

This corollary follows immediately from the uniqueness stated in the
theorem.

Let $\alpha, \beta, \ldots, \mu$ be m elements of $\Delta$ which are relatively prime to one
another; then the following matrices have the same elementary divisors
as the canonical forms given below them.

C!) C ,) 

C ') r ) (' )«.,>« 

\ 0 / \ a/3 / \ «» / 

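The alteration leading from the upper to the lower line can therefore be
performed by row- (column-) addition and multiplication with a unity. This
shows that by these operations a matrix of degree r

$$\begin{pmatrix} \alpha & & & \\ 1 & \alpha & & \\ & \ddots & \ddots & \\ 0 & & 1 & \alpha \end{pmatrix} \quad \text{can be transformed into} \quad A_r = \begin{pmatrix} 1 & & & \\ & \ddots & & \\ & & 1 & \\ & & & \alpha^r \end{pmatrix},$$

which is a canonical form. Moreover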



and from the last pair in (4) it follows that when $r_1 > r_2 > \cdots > r_p$,

which again is a canonical form; $\sum_i r_i = r$ is equal to the degree of the
matrix. Form now such matrices for the m different elements $\alpha, \beta, \ldots, \mu$
and put them together to a matrix of degree $\sum_j r^j = n$, where $r^j = \sum_i r^j_i$
for j = 1, ..., m and $r^j_1 > r^j_2 > \cdots > r^j_{p_j}$.


A = 



i 


i 

p 

i 



M 


One can replace the matrices in the diagonal by their canonical forms, and
then use the second pair of matrices in (4); then one gets a diagonal-matrix
$A^0$ with the diagonal elements $\varphi_i$, where

$$\varphi_i = \alpha^{r^1_i} \beta^{r^2_i} \cdots \mu^{r^m_i}$$

and $r^j_i = 0$ if $i > p_j$. Again, this form is a canonical form. This example
is important for 6-44.


6-43 Congruence by matrix multiplication. It has been shown in
1-(11)12 that row- and column-additions can be performed by multiplication
with certain elementary matrices $E_{rs}(\lambda)$, where $\det E_{rs}(\lambda) = 1$, and
similarly row- and column-multiplications, by multiplication with diagonal-
matrices composed of the multipliers; the multiplication with unities
is therefore made by diagonal-matrices of unities. The determinants of these
matrices are unities, and the same holds therefore for their products.
Hence $A \sim B$ implies $B = U_1 A U_2$, where $\det U_1$ and $\det U_2$ are unities.
It will be shown now that this condition for congruence is also sufficient.
Since A can be transformed by row- and column-additions into a diagonal-
matrix, $A = \Pi_1 D \Pi_2$, where $\Pi_1$ and $\Pi_2$ are products of elementary
matrices, and D is the canonical form of A. Since $\det \Pi_1 = \det \Pi_2 = 1$,
so is $\det A = \det D = \prod_1^n \varepsilon_i$. In the particular case when A = U, where
det U is a unity, $\prod \varepsilon_i$ is a unity and therefore each elementary divisor $\varepsilon_i$
is a unity. Hence U is a product of elementary matrices and a
diagonal-matrix of unities. Let now $\det U_1$ and $\det U_2$ be unities and
$B = U_1 A U_2$; then B is generated by multiplying A from both sides
with elementary matrices and diagonal-matrices of unities; hence one can
obtain B by performing row- and column-additions and row- and column-
multiplications with unities. Thus A and B are congruent, and the
following theorem holds.

Theorem. $A \sim B$ if and only if

$$B = U_1 A U_2, \tag{1}$$

where the determinants of $U_1$ and $U_2$ are unities.

If det U = u is a unity, then there exists in $R(\Delta, n)$ a matrix $U^{-1}$,
and $\det(U^{-1}) = u^{-1}$; thus $U_1^{-1} B U_2^{-1} = A$ is an equation of the same
type as (1).


6-44 The ring R(K[x], n). The ring of polynomials K[x] over the
field K is a Euclidean domain; its unities are the polynomials of degree 0,
that is, the elements of K which are different from 0 (see 2-47). Thus
one can apply the theory of the elementary divisors to the matrices over
K[x], and the theorems of 6-42 and 6-43 furnish immediately

Theorem 1. Every matrix B of R(K[x], n) can be represented by

$$B = B_1 \begin{pmatrix} \varepsilon_1 & & \\ & \ddots & \\ & & \varepsilon_n \end{pmatrix} B_2, \tag{1}$$

where the determinants of $B_1$ and $B_2$ are elements $\neq 0$ of K, and the
elementary divisors $\varepsilon_1, \ldots, \varepsilon_n$ are polynomials of K[x], each being a factor
of the following polynomials. The elementary divisors are uniquely
determined up to factors which are elements $\neq 0$ of K and which can be chosen
arbitrarily. The products $\varepsilon_1 \cdots \varepsilon_q = \delta_q$ are the determinant divisors.

Let $\Lambda$ be an extension of K; then every matrix B over K[x] is also
a matrix over $\Lambda[x]$, and it admits also a representation by elementary
divisors. Every h.c.f. of polynomials of K[x] is also an h.c.f. of the same
polynomials in $\Lambda[x]$, but for the h.c.f. in $\Lambda[x]$ the arbitrary factor is an
element of $\Lambda$, whereas in K[x] only a factor $\neq 0$ of K is arbitrary.
Hence the determinant divisors $\delta_1, \ldots, \delta_n$ are the same in both the cases,
up to a factor out of $\Lambda$ which is arbitrary. If one multiplies a polynomial
of K[x] with an element of $\Lambda$ which does not belong to K, then all the
coefficients $\neq 0$ of the product are elements of $\Lambda$ not belonging to K. Hence
if $\delta_q$ is a determinant divisor of B which has been obtained by considering B
as a matrix over $\Lambda[x]$, and if one of the coefficients $\neq 0$ of $\delta_q$ belongs
to K, then $\delta_q$ is a polynomial of K[x], and therefore $\delta_q$ is also the q-th
determinant divisor when B is considered as a matrix over K[x]. The
same holds for the elementary divisors, as these are quotients of
determinant divisors. Hence

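Theorem 2. Let B be a matrix of R(K[x], n) and $\Lambda \supset K$. If in
$R(\Lambda[x], n)$,

$$B \sim \begin{pmatrix} \varepsilon_1 & & \\ & \ddots & \\ & & \varepsilon_n \end{pmatrix},$$

where $\varepsilon_1, \ldots, \varepsilon_n$ are polynomials in $\Lambda[x]$ with coefficients of the highest
terms ±1, and $\varepsilon_i$ is a factor of $\varepsilon_k$ in $\Lambda[x]$ for i < k, then $\varepsilon_1, \ldots, \varepsilon_n$ are
polynomials in K[x] and are the elementary divisors of B in R(K[x], n).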

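This theorem will be applied now to

$$B = A - xE, \tag{2}$$

where A is a matrix over K. As in 6-33, denote by $\Lambda$ an extension of K
which contains all the roots of

$$\chi_A(x) = (\lambda_1 - x)^{r_1} (\lambda_2 - x)^{r_2} \cdots (\lambda_m - x)^{r_m}.$$

Now

$$A = C^{-1} \begin{pmatrix} L_1 & & \\ & \ddots & \\ & & L_m \end{pmatrix} C, \tag{3}$$

where C is a matrix over $\Lambda$, and $L_1, \ldots, L_m$ have the same significance
as in 6-33, (8). Then

$$A - xE = C^{-1} \begin{pmatrix} L_1 - xE_1 & & \\ & \ddots & \\ & & L_m - xE_m \end{pmatrix} C,$$

where $E_i$ is the unit-matrix of degree $r_i$. Since C is a matrix over $\Lambda$,
it is a unity matrix of $R(\Lambda[x], n)$. In this ring, therefore,

$$A - xE \sim \begin{pmatrix} L_1 - xE_1 & & \\ & \ddots & \\ & & L_m - xE_m \end{pmatrix}.$$

Put $\lambda_1 - x = \alpha$, $\lambda_2 - x = \beta$, ..., $\lambda_m - x = \mu$; then these m elements
of $\Lambda[x]$ are relatively prime to one another. One can therefore use the
notations introduced at the end of 6-42, and one obtains the result

$$A - xE \sim A^0, \tag{4}$$

where $A^0$ is the canonical form constructed at the end of 6-42, with the
diagonal elements $\varphi_i = \alpha^{r^1_i} \beta^{r^2_i} \cdots \mu^{r^m_i}$;
$\varphi_i(x)$ has the same meaning here as in (6-35, 13). The polynomials $\varphi_i(x)$
are therefore the elementary divisors of A − xE; the coefficients of the
highest terms are ±1. Hence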


Theorem 3. If A is a matrix over K, then the polynomials $\varphi_i(x)$ of
6-35, (3) are the elementary divisors of A − xE and are polynomials over K.
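A small worked example may make the theorem concrete. It is not taken from the text; the two matrices are chosen here only because they have the same characteristic polynomial $(2 - x)^2$. The determinant divisors are computed as the h.c.f. of the minors of each degree, and the elementary divisors as their quotients.

```latex
% Example 1: A = (2 1 ; 0 2)
\[
A - xE = \begin{pmatrix} 2 - x & 1 \\ 0 & 2 - x \end{pmatrix}, \qquad
\delta_1 = \operatorname{h.c.f.}(2 - x,\ 1,\ 0,\ 2 - x) = 1, \qquad
\delta_2 = (2 - x)^2 ,
\]
\[
\varepsilon_1 = 1, \qquad \varepsilon_2 = \delta_2 / \delta_1 = (2 - x)^2 .
\]
% Example 2: A' = (2 0 ; 0 2)
\[
A' - xE = \begin{pmatrix} 2 - x & 0 \\ 0 & 2 - x \end{pmatrix}, \qquad
\delta_1 = 2 - x, \qquad \delta_2 = (2 - x)^2 ,
\]
\[
\varepsilon_1 = 2 - x, \qquad \varepsilon_2 = \delta_2 / \delta_1 = 2 - x .
\]
```

The elementary divisors differ although the characteristic polynomials agree; hence A − xE and A' − xE are not congruent, in accordance with the fact that A and A' are not similar.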

This theorem supplements the considerations of 6-33, 6-34 and 6-35.
The canonical form of A (see 6-34) determines uniquely the representation of
the characteristic polynomial $\chi_A(x)$ as a product of polynomials $\varphi_k(x)$
(see 6-35, (3), (4) and (5)). These polynomials have been introduced as
polynomials in $\Lambda[x]$, where $\Lambda$ is an extension of K admitting the complete
reduction of $\chi_A(x)$. Now it has been shown that the polynomials belong
to K[x]. On the other hand, the degrees $r^i_k$ of the submatrices $H^i_k$ of
the canonical form of A are equal to the multiplicities of the roots of the
polynomials $\varphi_k(x)$. Thus the elementary divisors of A − xE determine
uniquely the canonical form of A up to an isomorphism of the field $\Lambda$ over K.

6-46* The ring R(J, n). The integral numbers form a Euclidean
domain J; its unities are +1 and −1. From 6-43 it follows therefore
that every matrix C with integral elements can be written in the form

$$C = E_1 D E_2, \tag{1}$$

where $\det E_1 = \pm 1$, $\det E_2 = \pm 1$, and D is a canonical form of C. Let
D' and D'' be diagonal-matrices with diagonal-elements ±1 only; then
$D'^2 = D''^2 = E$. In particular one can easily choose these matrices in such
a manner that $\det(E_1 D') = \det(D'' E_2) = 1$, and that the elements of $D' D D''$
are either all positive, or the first is negative and the others are positive.
Since $C = E_1 D' \cdot D' D D'' \cdot D'' E_2$, there is no loss of generality in supposing
that in (1),

$$\det E_1 = \det E_2 = 1, \qquad D = \begin{pmatrix} \pm\varepsilon_1 & & & \\ & \varepsilon_2 & & \\ & & \ddots & \\ & & & \varepsilon_n \end{pmatrix}, \tag{2}$$

where for i = 1, 2, ..., n, $\varepsilon_i > 0$, and $\varepsilon_i$ is divisible by $\varepsilon_{i-1}$. Put $E_2 E_1 = E'$;
then

$$C \to E_1^{-1} C E_1 = D E', \qquad \det E' = 1. \tag{3}$$

* Can be omitted at the 1st reading.

Applying the transformations corresponding to the matrices of R(J, n) to
any given vector $\xi$ of an n-dimensional Euclidean vectorspace, one obtains a
modul

$$a_1 \xi_1 + \cdots + a_n \xi_n, \tag{4}$$

where $a_1, \ldots, a_n$ run over all the integral numbers. The endpoints of these
vectors form an n-dimensional lattice L. Every matrix C of R(J, n) maps L
into itself, but the mapping is not a (1, 1)-representation unless det C = ±1.
Similar matrices represent the same transformation for different bases.
Hence by (3), every linear transformation of a lattice can be composed
of a transformation E' of the lattice into itself and a transformation D.
To every basis of a sublattice S there corresponds a particular "mesh" which
is "spun out" by the vectors of the basis; S is generated by parallel
displacements of the mesh. On the other hand, to every mesh there corresponds
a sublattice S, and there exist linear transformations such that S is
the image of L. By E', the mesh spun out by the original basis of L is
transformed into another mesh corresponding to L, and by D this mesh
is transformed into a mesh corresponding to the lattice S which is the image
of L for the transformation C. Hence the theory of elementary divisors
shows that if S is any sublattice of L, then there exist meshes $M_1$ and $M_2$
corresponding to L and S respectively, and one gets $M_2$ by multiplying the
edges of $M_1$ with $\varepsilon_1, \ldots, \varepsilon_n$ respectively. Since these numbers are
elementary divisors, each of them is a factor of the following ones.

6-5 Matrices and forms

A column-vector (x), when multiplied from the left side with an arbitrary
matrix, is transformed again into a column-vector A(x) = (x'). Similarly
a row-vector $(x)^T$ is transformed into a row-vector by multiplication from
the right side. Thus a product of matrices

$$(x)^T A (y) \tag{1}$$

is a matrix of which every element is necessarily equal to zero except the
first element in the first row, and this can easily be stated to be equal to

$$f(x, y) = \sum a^i_k x_i y_k. \tag{2}$$

Every bilinear form in x and y can be written in the form (2). Hence one
can apply the theory of matrices to bilinear forms. E.g. the transformation
of the bilinear forms can easily be expressed by matrices. Suppose
(x) = B(u), (y) = C(v); then $(x)^T = (u)^T B^T$, and $(x)^T A (y) = (u)^T B^T A C (v)$.
Put

$$G = B^T A C, \tag{3}$$

then

$$f(x, y) = \sum g^i_k u_i v_k. \tag{4}$$

Let in particular (x) and (y) be the expressions for two vectors of a
vectorspace when a particular basis is used, and let (u) and (v) be the same
vectors expressed by a different basis; then B = C, and therefore

$$G = B^T A B. \tag{3'}$$

This case is of particular interest; for instance it occurs when (x) = (y).
Thus the linear transformation of a quadratic form

$$f(x, x) = \sum a^i_k x_i x_k = \sum g^i_k u_i u_k = g(u, u) \tag{5}$$

can be obtained by the help of (3'). The vectors (x) and (y) can also be
interconnected in a different manner. E.g. the two vectors can be supposed
to be conjugate (see 6-51), or it can be postulated that when the (x) vector
is transformed, the (y) vectors are transformed in such a way that a certain
bilinear form remains invariant.
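The rule (3) can be checked numerically. The following Python sketch is only an illustration: the helper names `matmul`, `transpose`, `apply` and `bilinear` and the particular integer matrices are assumptions made here, not part of the text.

```python
def matmul(p, q):
    """Product of two matrices given as lists of rows."""
    return [[sum(p[i][j] * q[j][k] for j in range(len(q)))
             for k in range(len(q[0]))] for i in range(len(p))]


def transpose(p):
    return [list(col) for col in zip(*p)]


def apply(m, v):
    """Column-vector m(v)."""
    return [sum(m[i][k] * v[k] for k in range(len(v))) for i in range(len(m))]


def bilinear(a, x, y):
    """f(x, y) = sum of a[i][k] * x[i] * y[k], the single element of (x)^T A (y)."""
    return sum(a[i][k] * x[i] * y[k]
               for i in range(len(x)) for k in range(len(y)))


A = [[1, 2], [3, 4]]
B = [[1, 1], [0, 1]]
C = [[2, 0], [1, 1]]
u, v = [1, -1], [3, 2]

G = matmul(matmul(transpose(B), A), C)   # G = B^T A C, formula (3)
x, y = apply(B, u), apply(C, v)          # (x) = B(u), (y) = C(v)

print(bilinear(A, x, y), bilinear(G, u, v))
```

Both values agree, which is the content of (4); replacing C by B gives the case (3').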

6-51 Unitary matrices

From 6-5, (3) it appears already that the theory of similarity of
matrices (see 6-3 and sub-sections) can be applied to the transformation of a
quadratic form when $B^T = B^{-1}$. Matrices with this property therefore
promise to be particularly interesting. It is however preferable to use
here a slight generalisation.

Consider two fields K and $\Lambda$; as in 2-742 and in 3-33, it is supposed
that $[\Lambda : K] = 2$ and that therefore $\Lambda$ consists of pairs of conjugate
elements. For these pairs, the same suppositions are made here as in 3-33,
and therefore the theorems established there can be applied here. It is
convenient to introduce the notation M for a field

$$\text{either } M = K \text{ or } M = \Lambda \tag{1}$$

to avoid repetition of the same considerations. If M = K, conjugate
elements are equal, $\bar a = a$, and $a \bar a = a^2$; if $M = \Lambda$, a distinction
between conjugate elements must be made. The most important case for
applications is when $\Lambda$ is the field of the complex numbers and K the field
of the real numbers. The conditions of 3-33 are obviously satisfied for
this pair of fields.


Let $(a^i_k) = A$; then denote

$$(\bar a^i_k) = \bar A. \tag{2}$$

Since the interchange of conjugate elements is an automorphism (in the
case when M = K, it is even the identity), for every polynomial f(x) of M[x],

$$\overline{f(A)} = \bar f(\bar A). \tag{3}$$

In particular $\overline{A^{-1}} = \bar A^{-1}$. Moreover the conjugate matrix to the
transposed is the transposed of the conjugate, and $\det \bar A = \overline{\det A}$. Denote

$$\bar A^T = \overline{A^T} = A^\dagger. \tag{4}$$

Obviously $(A^T)^T = \bar{\bar A} = (A^\dagger)^\dagger = A$, and from (4) it follows that
$(A^T)^\dagger = (A^\dagger)^T$ and $(\bar A)^\dagger = \overline{A^\dagger}$. The elements of $A^{-1}$ are $A^k_i / \det A$,
where $A^k_i$ denotes the cofactor of $a^k_i$, and since $\det A^T = \det A$, the elements
of $(A^T)^{-1}$ are $A^i_k / \det A$. Hence $(A^T)^{-1} = (A^{-1})^T$, and, taking the
conjugate, $(A^\dagger)^{-1} = (A^{-1})^\dagger$. All these statements are comprehended in
the following lemma.

Lemma. The symbols $\bar{\ }$, T, †, −1 as exponents of a matrix are
commutative.

Apply these symbols to a product of two matrices, and record

$$\overline{AB} = \bar A \bar B, \quad (AB)^T = B^T A^T, \quad (AB)^{-1} = B^{-1} A^{-1}, \quad (AB)^\dagger = B^\dagger A^\dagger. \tag{5}$$

Definition. U is said to be a unitary matrix over M if

$$U^\dagger = U^{-1}. \tag{6}$$

Let $U$, $U_1$, $U_2$ be unitary matrices of degree n over M. Then $U^{-1}$ and
$U_1 U_2$ are unitary; indeed $(U^{-1})^\dagger = U^{\dagger\dagger} = U = (U^{-1})^{-1}$, and
$(U_1 U_2)^\dagger = U_2^\dagger U_1^\dagger = U_2^{-1} U_1^{-1} = (U_1 U_2)^{-1}$. In particular, the matrix
$E = E^\dagger = E^{-1}$ is a unitary matrix. (This shows that unitary matrices really
exist.) Hence the unitary matrices of degree n over M form a system

$$G_u(n, M) \tag{7}$$

with the following properties:

1. There exists an associative composition of the matrices, such that if $U_1$
and $U_2$ belong to (7), then also $U_1 U_2$ belongs to it.

2. $G_u(n, M)$ contains a unit-element E, such that UE = EU = U holds
for every element U of it.

3. To every U of $G_u(n, M)$ there corresponds an inverse element $U^{-1}$ in
$G_u(n, M)$ satisfying $U U^{-1} = U^{-1} U = E$.

Apply the notation already introduced in 6-11, and express these
statements by

Theorem 1. The matrices of $G_u(n, M)$ form a group (the group
of the unitary matrices of degree n over M).

From the lemma and (6) it follows immediately that when U belongs
to $G_u(n, M)$, then, besides $U^{-1}$, also $\bar U$, $U^T$ and $U^\dagger$ belong to it. Moreover
the "norm"

$$\det U \cdot \overline{\det U} = 1 \tag{8}$$

for every unitary matrix, since $\overline{\det U} = \det \bar U = \det U^\dagger = \det(U^{-1}) =
(\det U)^{-1}$.
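The group properties and the relation (8) can be verified on small examples over the complex numbers. The following Python sketch is an illustration only; the helper names `dagger`, `matmul`, `is_unitary` and `det2`, the tolerance, and the sample matrices are assumptions made here.

```python
def dagger(u):
    """Conjugate of the transposed matrix."""
    n = len(u)
    return [[complex(u[k][i]).conjugate() for k in range(n)] for i in range(n)]


def matmul(p, q):
    n = len(p)
    return [[sum(p[i][j] * q[j][k] for j in range(n)) for k in range(n)]
            for i in range(n)]


def is_unitary(u, tol=1e-12):
    """Check condition (6) in the equivalent form U U^dagger = E."""
    n = len(u)
    c = matmul(u, dagger(u))
    return all(abs(c[i][k] - (1 if i == k else 0)) <= tol
               for i in range(n) for k in range(n))


def det2(u):
    """Determinant of a matrix of degree 2."""
    return complex(u[0][0] * u[1][1] - u[0][1] * u[1][0])


U1 = [[0, 1j], [1j, 0]]
U2 = [[1j, 0], [0, 1]]

print(is_unitary(U1), is_unitary(U2))
print(is_unitary(matmul(U1, U2)), is_unitary(dagger(U1)))
print(abs(det2(U2) * det2(U2).conjugate()))
```

The product and the inverse (here computed as the conjugate transposed) are again unitary, and the norm of the determinant equals 1.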

Express (6) in terms of the elements $u^i_k$. Put $U U^\dagger = C$. Then

$$c^i_k = \sum_j u^i_j \bar u^k_j,$$

and since (6) means that C = E, one gets

$$\sum_j u^i_j \bar u^k_j = 0 \text{ for } i \neq k, \qquad = 1 \text{ for } i = k \tag{6'}$$

as a system of equations which is equivalent to (6). Since $U^T$ is also unitary,

$$\sum_j u^j_i \bar u^j_k = 0 \text{ for } i \neq k, \qquad = 1 \text{ for } i = k. \tag{6''}$$

On the other hand, if $U^T$ is unitary, then U is. Hence (6'') implies (6').
Hence (6), (6') and (6'') are equivalent systems of necessary and sufficient
conditions for the unitariness of U. A fourth form for these conditions is

$$\bar u^i_k = U^i_k / \det U, \tag{6'''}$$

where $U^i_k$ denotes the cofactor of $u^i_k$.

It has already been mentioned that E is a unitary matrix, but it has
not yet been proved that there exist unitary matrices which are not unit
matrices. As a matter of fact, one can choose the first row of a unitary matrix
nearly arbitrarily. It has been proved already in 3-33, th. 2 that given any
system $v_1, \ldots, v_n \neq 0, \ldots, 0$ of elements of M, one can find $n^2$ elements $u^i_k$
satisfying (6') such that $u^1_k = \lambda v_k$ (k = 1, ..., n); obviously one can
determine the $u^i_k$ also in such a way that $u^k_1 = \lambda v_k$. In terms of
matrices this means

Theorem 2. Let $v_1, \ldots, v_n$ be n elements of M but different from
0, ..., 0; then there exists a unitary matrix, the first row (column) of
which differs from $v_1, \ldots, v_n$ by a factor $\lambda$ only which is a suitably chosen
element of M.

It may be remembered that in 3-33 the factor $\lambda$ was determined by
the condition $N(\lambda) = \sum v_i \bar v_i$. Therefore $\lambda$ can be replaced by $\alpha\lambda$, where
$\alpha$ is any element with $\alpha \bar\alpha = 1$. In particular $\lambda$ can always be replaced
by $-\lambda$.

If $U$ and $U_1$ are unitary, then

$$\begin{pmatrix} U & \\ & U_1 \end{pmatrix} \tag{9}$$

satisfies the conditions (6') and is therefore unitary. In particular
$\begin{pmatrix} 1 & \\ & U \end{pmatrix}$ is unitary.

Consider now a vectorspace, say V, of rank n over M, and distinguish
in it a basis

$$e_1, \ldots, e_n. \tag{10}$$

Let U be a unitary matrix of degree n over M, and consider the transformation
of V which one gets by applying U to the basis (10). Since there exists
an inverse transformation, the basis (10) is transformed into n
independent vectors

$$e'_1, \ldots, e'_n \tag{10'}$$

which form a basis of V; expressed in terms of the basis (10),

$$e'_k = (u^1_k, \ldots, u^n_k),$$

i.e. the vectors (10') are expressed by the columns of the unitary matrix
U. Let U run over all the matrices of $G_u(n, M)$; then (10') runs over
all the systems of n vectors for which the scalar products satisfy the
conditions

$$e'_i e'_i = 1, \qquad e'_i e'_k = 0 \text{ for } i \neq k. \tag{11}$$

Thus given any basis (10), the group $G_u(n, M)$ generates a system $\Sigma$ of
bases of V, the coordinates of which (expressed by (10)) satisfy the conditions
(11). Let (10) be transformed into (10') by U and into $e''_1, \ldots, e''_n$ by $U_1$;
then $U_1 U^{-1}$ transforms $e'_k$ to $e''_k$. Now $U_1 U^{-1}$ is a unitary matrix;
hence if one applies the unitary matrices to any basis of the system $\Sigma$, one
gets the same system $\Sigma$ of bases. A transformation of V which is obtained
by applying a unitary matrix to a particular basis of the system $\Sigma$ is called
a unitary transformation. It must be emphasised that for unitary
transformations the bases of V are not all equivalent, but particular systems of
bases are distinguished. Let $\beta_1, \ldots, \beta_n$ be an arbitrary basis (see 6-2)
and $\beta_k = \sum_i b^i_k e_i$; then the unitary transformations of V are expressed
by the $\beta$-basis by the matrices

$$B^{-1} U B,$$

and these matrices are in general not unitary. On the other hand, a
transformation which is represented by a unitary matrix when the $\beta$-basis is used
might be expressed by a non-unitary matrix when the original basis is used,
and therefore it will not be a unitary transformation.

6-511. Orthogonal matrices. Consider now the particular case when
M = K. Then $\bar A = A$, $A^T = A^\dagger$. Unitary matrices are called orthogonal
in this case. From 6-51, (6), (6'), (6'') and (8) it follows that if R is an
orthogonal matrix, the following equations hold:

$$R^T = R^{-1}, \qquad \sum_k r^i_k r^j_k = \sum_k r^k_i r^k_j = 0 \text{ for } i \neq j,$$
$$\det R = \pm 1, \qquad \sum_k (r^i_k)^2 = \sum_k (r^k_i)^2 = 1. \tag{1}$$

The n row-vectors and the n column-vectors form therefore two
"orthogonal systems" (see 1-7, Def. 3).

If in particular K is the field of the real numbers, then V is an
n-dimensional Euclidean vector-space. Let in this space

$$e_1, \ldots, e_n \tag{2}$$

be n mutually perpendicular vectors of equal length, say length 1; then
this basis is transformed by R into n mutually orthogonal vectors of length 1;
these n-tuplets form the system of bases which in 6-51 was called $\Sigma$. The
matrices of $G_u(n, K)$ represent the rigid motions of V if det R = 1 and
the symmetries if det R = −1, when the basis (2) or an equivalent basis
(i.e. a basis of $\Sigma$) is used. The rigid motions and symmetries of an
n-dimensional Euclidean point-space are composed of the corresponding
transformations of V and parallel displacements. Theorem 2 of 6-51 shows
that the direction of one vector of an orthogonal system can be arbitrarily
chosen.

6-52 Symmetric and antisymmetric matrices

Definition 1. A matrix S is called symmetric if

$$S = S^T. \tag{1}$$

Definition 2. A matrix A is called antisymmetric (or skew) if

$$A = -A^T. \tag{2}$$

These definitions are applied to matrices over arbitrary fields and
integral domains. In terms of elements, a symmetric matrix S is
characterised by

$$s^i_k = s^k_i. \tag{1'}$$

Antisymmetric matrices over a field of characteristic 2 are symmetric (and
conversely); if the characteristic of K is different from 2, then an
antisymmetric matrix is characterised by

$$a^i_k = -a^k_i, \qquad a^i_i = 0. \tag{2'}$$

Let B be an arbitrary matrix over a field K of characteristic $\neq 2$; then

$$B = S + A, \tag{3}$$

where $s^i_k = (b^i_k + b^k_i)/2$, $a^i_k = (b^i_k - b^k_i)/2$.
It is easy to see that the representation (3) of B is possible in this way only.

Exercises. (1) Every antisymmetric matrix of rank r is similar to a
matrix $\begin{pmatrix} A & 0 \\ 0 & 0 \end{pmatrix}$, where A is antisymmetric and of degree r.

(2) The rank of an antisymmetric matrix is necessarily even.

(3) Let A be antisymmetric and of degree 2n, and let the m = n(2n − 1)
elements of A above the diagonal be indeterminates $y_1, \ldots, y_m$; then
$\det A = f^2(y_1, \ldots, y_m)$, where f is irreducible.
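The representation (3) of 6-52 is easily computed. The following Python sketch is an illustration only; the name `split` is chosen here, and the field is assumed to be the field of the rational numbers (characteristic 0), represented by `fractions.Fraction`.

```python
from fractions import Fraction


def split(b):
    """Return (S, A) with B = S + A, S symmetric and A antisymmetric."""
    n = len(b)
    s = [[Fraction(b[i][k] + b[k][i], 2) for k in range(n)] for i in range(n)]
    a = [[Fraction(b[i][k] - b[k][i], 2) for k in range(n)] for i in range(n)]
    return s, a


B = [[1, 2], [4, 3]]
S, A = split(B)

# S equals its transposed, A equals the negative of its transposed,
# and the sum of both restores B.
print(S, A)
```

The division by 2 is the step which fails over a field of characteristic 2.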

6-53 Hermitean matrices. Here again, the notations K, $\Lambda$, M are
used as in 6-51.

Definition. A matrix H over M is said to be Hermitean over M if

$$H^\dagger = H \tag{1}$$

and the roots of $\chi_H(x)$ belong to M. If U is unitary and H is Hermitean
over M, then $H_1 = U^{-1} H U$ is also Hermitean over M. Of course it
follows from (1) and from 6-51, (5) and (6) that $H_1^\dagger = U^\dagger H^\dagger (U^{-1})^\dagger = H_1$;
and $\chi_{H_1}(x) = \chi_H(x)$; hence the roots of this polynomial belong to M.
For the elements $h^i_k$ of H,

$$h^i_k = \bar h^k_i \tag{2}$$

holds. In particular the diagonal elements $h^i_i = \bar h^i_i$ are self-conjugate
and therefore belong to K. If all the elements of H belong to K, then H
is symmetric. If $H_1$ and $H_2$ are Hermitean, then

$$\begin{pmatrix} H_1 & \\ & H_2 \end{pmatrix}$$

is Hermitean, and conversely. If in particular

$$H_1 = \begin{pmatrix} \lambda & \\ & H' \end{pmatrix} \tag{3}$$

is Hermitean, then $\lambda$ belongs to K, and H' is Hermitean. From this remark
will be deduced the following

Theorem 1. If H is Hermitean over M, the roots of $\chi_H(x)$ belong to K.

Proof. Let $\lambda$ be any root of $\chi_H(x)$; then $\lambda$ belongs to M and there
exists in the vectorspace V over M a vector $\xi = (v_1, \ldots, v_n)$ such that
$(H - \lambda E)\,\xi = 0$. Let $U_1$ be a unitary matrix whose first column differs
from $\xi$ by a common factor only (see 6-51, th. 2); then

$$H_1 = U_1^{-1} H U_1 \tag{4}$$

transforms (1, 0, ..., 0) into ($\lambda$, 0, ..., 0), and this must be the first column
of $H_1$. Since $H_1$ is Hermitean, the first row must be conjugate to the first
column. Hence $H_1$ has the form (3); thus $\lambda$ belongs to K. Hence the
theorem.


Applying this theorem to matrices over K, one gets immediately

Theorem 1'. If S is symmetric over K, and the roots of $\chi_S(x)$ belong
to $\Lambda$, then these roots belong to K.

Moreover:

Theorem 2. If H is Hermitean over M, then there exists a matrix U
which is unitary over M such that

$$H = U \begin{pmatrix} \lambda_1 & & \\ & \ddots & \\ & & \lambda_n \end{pmatrix} U^{-1}, \tag{5}$$

where $\lambda_1, \ldots, \lambda_n$ are the roots of the characteristic polynomial of H.

Proof. Let $\lambda_1$ be any root of $\chi_H(x)$; then it follows from (3) and (4)
that $H = U_1 H_1 U_1^{-1}$, where $H_1 = \begin{pmatrix} \lambda_1 & \\ & H' \end{pmatrix}$. Let $\lambda_2$ be a root of $\chi_{H'}(x)$.
Since H' is a Hermitean matrix of degree n − 1, there is a Hermitean
matrix $H'_1 = \begin{pmatrix} \lambda_2 & \\ & H'' \end{pmatrix} = U_1'^{-1} H' U'_1$, where H'' is Hermitean and $U'_1$
is unitary and of degree n − 1; hence $U_2 = \begin{pmatrix} 1 & \\ & U'_1 \end{pmatrix}$ is unitary and
of degree n (see 6-51, (9)). From 6-13, (2) it follows that

$$H = U_1 U_2 \begin{pmatrix} \lambda_1 & & \\ & \lambda_2 & \\ & & H'' \end{pmatrix} U_2^{-1} U_1^{-1}.$$

Repeat this step n times and put $U_1 U_2 \cdots U_n = U$; then U is unitary
and the theorem follows.


Put M = K; then the unitary matrix U is an orthogonal matrix.
From theorem 1' and theorem 2 it follows therefore immediately:

Theorem 2'. If S is a symmetric matrix over K and the roots of $\chi_S(x)$
belong to $\Lambda$, then

$$S = R \begin{pmatrix} \lambda_1 & & \\ & \ddots & \\ & & \lambda_n \end{pmatrix} R^{-1}, \tag{6}$$

where R is an orthogonal matrix over K.

The condition that the roots of $\chi_S(x)$ belong to $\Lambda$ can be omitted when
$\Lambda$ is known to be a closed field (see 3-8). Now the field of the complex
numbers is a closed one, and the fields of the real and of the complex
numbers form a pair of fields K, $\Lambda$ satisfying the conditions of 6-51. Applied
to this particular pair of fields, the theorems 1' and 2' furnish

Theorem 3. If S is a symmetric matrix over the field of the real
numbers, then the roots $\lambda_1, \ldots, \lambda_n$ of its characteristic polynomial are
all real, and S can be transformed by an orthogonal transformation into
the diagonal-matrix

$$\begin{pmatrix} \lambda_1 & & \\ & \ddots & \\ & & \lambda_n \end{pmatrix}. \tag{7}$$

The equation $\chi_S(x) = 0$ is often called secular equation on account of
its importance for the theory of secular disturbances.
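For a real symmetric matrix of degree 2 the orthogonal matrix of Theorem 3 can be written down explicitly as a rotation. The following Python sketch is an illustration only and does not follow the inductive proof given above: the name `diagonalise_symmetric_2x2` and the choice of the rotation angle by `math.atan2` are assumptions made here, and the computation is carried out in floating-point arithmetic.

```python
import math


def diagonalise_symmetric_2x2(a, b, d):
    """For S = [[a, b], [b, d]] return (R, D) with D = R^T S R, R a rotation."""
    theta = 0.5 * math.atan2(2 * b, a - d)   # angle which removes the mixed term
    c, s = math.cos(theta), math.sin(theta)
    r = [[c, -s], [s, c]]
    rt = [[c, s], [-s, c]]                    # R^T = R^{-1}
    sm = [[a, b], [b, d]]

    def mul(p, q):
        return [[sum(p[i][j] * q[j][k] for j in range(2)) for k in range(2)]
                for i in range(2)]

    return r, mul(mul(rt, sm), r)


R, D = diagonalise_symmetric_2x2(2.0, 1.0, 2.0)
print(D)
```

For this matrix the secular equation is $(2 - x)^2 - 1 = 0$ with the real roots 3 and 1, which appear (up to rounding) in the diagonal of D.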


6-54 Hermitean and quadratic forms. Extend the field M to a ring of
polynomials,

$$M[\{x\}] \tag{1}$$

by introducing 2n or n indeterminates. If $M = \Lambda$, there are 2n
indeterminates

$$x_1, \ldots, x_n; \qquad \bar x_1, \ldots, \bar x_n. \tag{2}$$

If M = K, there are n indeterminates $x_1, \ldots, x_n$ only, but one may use the
notation (2) and put $\bar x_i = x_i$ for i = 1, ..., n. In both cases, $x_i$ and $\bar x_i$
are said to be conjugate. Thus one can extend the automorphism
interchanging conjugate elements of M to an automorphism

$$\sum a_{\alpha\beta}\, x_1^{\alpha_1} \cdots x_n^{\alpha_n}\, \bar x_1^{\beta_1} \cdots \bar x_n^{\beta_n} \;\longleftrightarrow\; \sum \bar a_{\alpha\beta}\, \bar x_1^{\alpha_1} \cdots \bar x_n^{\alpha_n}\, x_1^{\beta_1} \cdots x_n^{\beta_n} \tag{3}$$

which interchanges conjugate elements of M as well as conjugate
indeterminates (2). When M = K, then (3) is the identity. Let (x) be the column-
vector with the elements $x_1, \ldots, x_n$; then $(x)^\dagger$ is the row-vector with elements
$\bar x_1, \ldots, \bar x_n$. If H is Hermitean over M,

$$F = (x)^\dagger H (x) = F^\dagger \tag{4}$$

is a matrix, the first element of which is equal to

$$f(x, \bar x) = \sum h^i_k \bar x_i x_k = \overline{f(x, \bar x)}, \tag{5}$$

whereas the other elements are equal to 0. The bilinear form (5) is called
a Hermitean form. In the particular case when M = K, the form is a
quadratic form

$$f(x, x) = \sum h^i_k x_i x_k = \sum h^i_i x_i^2 + 2 \sum_{i<k} h^i_k x_i x_k. \tag{6}$$

On the other hand, every quadratic form over a field of characteristic $\neq 2$
can be expressed by (6). The fields K, $\Lambda$, M are all supposed to be of
characteristic 0 (see 3-33). The theorems of 6-53 furnish therefore:

Theorem 1 By a umtary transformation of x u , . , x n say (x) — » U(x), 
(x) + — » (x)* U\ every Hermitean form over A can be transformed into a 
canonical form over K 

2 a ( Xi x, (7) 

Theorem 2. By an orthogonal transformation of x t , x n every 
quadratic form over K can be transformed into a canonical form 

2 Hi x 2 |. (8) 

A transformation of x 1; . , x a by U means that the corresponding Hermitean 
matrix is transformed into a similar one Now two similar diagonal-matrices 
have the same diagonal-elements (only they may be arranged m a different 
order) Hence the coefficients eq m (7) and (8) are uniquely determined. 
They are the roots of the characteristic polynomial In terms of forms, 
the unitary matrices are characterised as follows . 

69 0. P.— 38 
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Theorem 3 A matrix U is unitary if and only if the form 

2 x \ ( 9 ) 

is invariant for the transformation (x) — r U(x), (x) T —> (x) T U + 

Proof. The Hermitean matrix corresponding to (9) is E; the matrix
of the transformed Hermitean form is therefore $U^+ E U$. Now the two
equations $U^+ E U = E$ and $U^+ = U^{-1}$ are equivalent. The first equation
means that (9) is invariant for U, the second equation is the condition for
U being unitary. Hence the theorem.
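
The equivalence used in the proof can be tried out on small matrices. The sketch below is an illustration under stated assumptions only: the matrices U and V, the vector x and all function names are hypothetical and chosen for this example. It tests the condition $U^+ U = E$ and compares the value of the form (9) before and after the transformation.

```python
def conj_transpose(U):
    """The matrix U^+ (transposed and conjugated)."""
    return [[U[j][i].conjugate() for j in range(len(U))]
            for i in range(len(U[0]))]

def matmul(A, B):
    """Product of two matrices given as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def apply(U, x):
    """The column-vector U(x)."""
    return [sum(U[i][k] * x[k] for k in range(len(x))) for i in range(len(U))]

def norm_form(x):
    """Value of the form (9): the sum of conj(x_i) * x_i."""
    return sum((xi.conjugate() * xi).real for xi in x)

def is_unitary(U, tol=1e-12):
    """Test U^+ U = E up to a rounding tolerance."""
    P = matmul(conj_transpose(U), U)
    n = len(U)
    return all(abs(P[i][j] - (1 if i == j else 0)) < tol
               for i in range(n) for j in range(n))

r = 2 ** -0.5
U = [[r, r * 1j], [r * 1j, r]]   # a unitary matrix (assumed example)
V = [[1, 1], [0, 1]]             # regular, but not unitary
x = [1 + 2j, 3 - 1j]

print(is_unitary(U), norm_form(x), norm_form(apply(U, x)))
print(is_unitary(V), norm_form(x), norm_form(apply(V, x)))
```

For U the value of the form is unchanged (up to rounding), whereas the non-unitary matrix V alters it.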

6-541 The law of inertia for quadratic forms with real coefficients.
Though there is one and only one canonical form

$$\sum a_i x_i^2 \tag{1}$$

into which a quadratic form over K can be transformed by orthogonal
transformations, this canonical form might be simplified by non-orthogonal
transformations. Indeed, put $x_i = b_i x'_i$; then (1) is transformed into
$\sum a'_i {x'_i}^2$, where $a'_i = a_i b_i^2$. Hence the coefficients $a_i$ take
factors which are arbitrary squares in K. If, in particular, K is chosen as
the field of the real numbers, the b's can be selected in such a way that
$a'_i$ takes only the values +1, -1 and 0. That there is no further reduction
of the canonical form is shown by the following theorem.

Law of inertia for quadratic forms. Any quadratic form in $x_1, \ldots, x_n$
with real coefficients can be transformed by a matrix of rank n with real
elements into one and only one canonical form

$$q(x) = x_1^2 + \cdots + x_p^2 - x_{p+1}^2 - \cdots - x_r^2 \tag{2}$$

where $p \leq r \leq n$.

Proof. That a transformation of any form into (2) can be performed
has been shown already. The number r is equal to the rank of the matrix
S which corresponds to (2), and it is also the rank of every matrix $A^T S A$,
when A is of rank n. Hence r is an invariant for the transformations under
consideration. It remains to show that p is an invariant. Suppose

$$q_1(z) = z_1^2 + \cdots + z_q^2 - z_{q+1}^2 - \cdots - z_r^2, \tag{2'}$$

$x_i = b^i_1 z_1 + \cdots + b^i_r z_r$, $z_i = c^i_1 x_1 + \cdots + c^i_r x_r$, for
$i = 1, \ldots, r$. Then $q(x) = q_1(z)$ for corresponding values of x and z.
It has to be proved that p = q. Let $p \neq q$, say p > q (without loss of
generality), and solve the q + r - p < r linear homogeneous equations

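$$c^k_1 x_1 + \cdots + c^k_r x_r = 0 \quad \text{for } k = 1, \ldots, q,$$
$$x_t = 0 \quad \text{for } t = p + 1, \ldots, r. \tag{3}$$

These equations have a solution $(\xi_1, \ldots, \xi_p, 0, \ldots, 0) \neq (0, \ldots, 0)$.
In terms of z the equations (3) are expressed by

$$z_s = 0 \quad \text{for } s = 1, \ldots, q,$$
$$b^j_1 z_1 + \cdots + b^j_r z_r = 0 \quad \text{for } j = p + 1, \ldots, r.$$

The corresponding solution is $(0, \ldots, 0, \zeta_{q+1}, \ldots, \zeta_r)$. Now
$q(\xi) > 0$, $q_1(\zeta) \leq 0$, contrary to $q(\xi) = q_1(\zeta)$. Hence the theorem.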

A quadratic form with real coefficients is called positive definite if in the
canonical form (2) there is n = r = p; if however n = r, p = 0, it is said
to be negative definite; when n > r and p = r or 0, it is semi-definite, and
when r > p > 0 it is indefinite.
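
The numbers p and r of the canonical form (2), and with them the classification just given, can be computed by exact rational arithmetic. The following sketch is not the method of the text but one possible procedure, assumed here for illustration: simultaneous elementary row and column operations, each of which replaces S by some $A^T S A$ with A of rank n. The function names `inertia`, `classify` and `congruent` are hypothetical, and the entries of S are assumed to be rational.

```python
from fractions import Fraction

def inertia(S):
    """Return (p, r) for a symmetric matrix S with rational entries."""
    n = len(S)
    A = [[Fraction(v) for v in row] for row in S]
    p = neg = 0
    for k in range(n):
        rest = range(k, n)
        # Look for a non-zero diagonal element in the remaining block.
        j = next((j for j in rest if A[j][j] != 0), None)
        if j is None:
            # All remaining diagonal elements vanish: use an off-diagonal one.
            pair = next(((i, l) for i in rest for l in rest
                         if i < l and A[i][l] != 0), None)
            if pair is None:
                break  # the remaining block is zero
            i, l = pair
            for c in range(n):      # add row l to row i
                A[i][c] += A[l][c]
            for t in range(n):      # add column l to column i
                A[t][i] += A[t][l]
            j = i                   # now A[i][i] is non-zero
        # Bring the pivot to position k by swapping rows and columns.
        A[k], A[j] = A[j], A[k]
        for row in A:
            row[k], row[j] = row[j], row[k]
        pivot = A[k][k]
        if pivot > 0:
            p += 1
        else:
            neg += 1
        # Clear row k and column k outside the diagonal.
        for i in range(k + 1, n):
            f = A[i][k] / pivot
            for c in range(n):
                A[i][c] -= f * A[k][c]
            for t in range(n):
                A[t][i] -= f * A[t][k]
    return p, p + neg

def classify(S):
    """Name of the form according to the values of n, r and p."""
    n = len(S)
    p, r = inertia(S)
    if r == n and p == n:
        return "positive definite"
    if r == n and p == 0:
        return "negative definite"
    if r < n and p in (0, r):
        return "semi-definite"
    return "indefinite"

def congruent(S, T):
    """The matrix T^T S T."""
    n = len(S)
    ST = [[sum(S[i][k] * T[k][j] for k in range(n)) for j in range(n)]
          for i in range(n)]
    return [[sum(T[k][i] * ST[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

print(classify([[2, 1], [1, 2]]), classify([[0, 1], [1, 0]]), classify([[1, 1], [1, 1]]))
```

In agreement with the law of inertia, `inertia(congruent(S, T))` returns the same pair as `inertia(S)` whenever T is of rank n; this may be tried for instance with S = [[0, 1], [1, 0]] and T = [[1, 2], [3, 4]].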

6-542 Applications to Geometry. The theory of matrices and of
quadratic forms admits a large number of applications to Geometry, of which
a few may be mentioned here. A linear transformation of $x_1, \ldots, x_n$ means
a collineation of an (n - 1)-dimensional projective space when homogeneous
coordinates are used, and it means an affinity with fixed origin of an
n-dimensional space when non-homogeneous coordinates are used. An
orthogonal transformation has to be interpreted as a rigid motion or a
symmetry (according as the determinant is +1 or -1) of a metrical
n-dimensional space, the origin remaining fixed. If one multiplies any column
of an orthogonal matrix with -1, it remains orthogonal, and its determinant
changes sign. It is therefore possible to transform a quadratic form with
real coefficients into the canonical form by an orthogonal transformation
with determinant +1. In a projective (n - 1)-dimensional space, an
orthogonal transformation means a collineation for which the quadric
$x_1^2 + \cdots + x_n^2 = 0$ remains invariant. Forms occurring in Geometry are
either completely determined, or they are only the left side of an equation
where the right side is zero; in the latter case, they are determined up to
an arbitrary factor $\neq 0$ only. By these considerations, the preceding
result furnishes easily the complete classification of the quadrics of an
n-dimensional metric space as follows:
1. $a_1 x_1^2 + \cdots + a_n x_n^2 + 1 = 0$

2. $a_1 x_1^2 + \cdots + a_{n-1} x_{n-1}^2 + x_n = 0$ \qquad (1)

3. $a_1 x_1^2 + \cdots + a_n x_n^2 = 0$

In the first case, the a's are uniquely determined; in the second, a factor ±1
remains arbitrary; and in the third case a real factor of the a's is
undetermined.
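
The remark made above, that multiplying a column of an orthogonal matrix with -1 leaves it orthogonal and changes the sign of its determinant, can be checked on a small example. The matrices U and V below and the helper names are assumptions chosen for illustration only.

```python
def transpose(A):
    """Transposed matrix."""
    return [list(row) for row in zip(*A)]

def matmul(A, B):
    """Product of two matrices given as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def det2(A):
    """Determinant of a matrix with two rows and two columns."""
    return A[0][0] * A[1][1] - A[0][1] * A[1][0]

def is_orthogonal(U):
    """Test U^T U = E for a matrix with two rows and two columns."""
    return matmul(transpose(U), U) == [[1, 0], [0, 1]]

U = [[0, 1], [1, 0]]                   # a symmetry: determinant -1
V = [[row[0], -row[1]] for row in U]   # second column multiplied with -1

print(is_orthogonal(U), det2(U))
print(is_orthogonal(V), det2(V))
```

Here V is a rigid motion (a rotation through a right angle), while U is a symmetry.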

For the affine space, there are also the canonical forms (1), but the a's
are supposed to take the values +1, -1, 0 only. For the projective
(n - 1)-dimensional space, the canonical forms are

$$x_1^2 + \cdots + x_p^2 - x_{p+1}^2 - \cdots - x_r^2 \tag{2}$$

with $r/2 \leq p \leq r \leq n$.

6-55. Bilinear forms with contragradient indeterminates. At the
beginning of 6-5, it has been shown that a bilinear form
$f(x, y) = \sum a^i_k x_i y_k$ can be represented by a matrix product
$(x)^T A (y)$, and that the transformations $(x) = B(u)$, $(y) = C(v)$ generate
a transformation

$$A \to B^T A C. \tag{1}$$

This formula has been applied to cases when (x) and (y) are either identical
or conjugate, but they may also be linked together in a different manner.

(1) The vectors (x) and (y) are bound to be transformed in the same
way; then they are said to be cogradient. In this case, B = C and therefore

$$A \to B^T A B. \tag{2}$$

A pair of points of an n-dimensional Euclidean space is an instance of two
cogradient vectors, but for "point" one may take also any other geometrical
entity expressed by coordinates.

(2) The vectors (x) and (y) are bound to be transformed in such a
way that the bilinear form $x_1 y_1 + \cdots + x_n y_n$ remains invariant. Then
$B^T E C = E$, and therefore $B^T = C^{-1}$. Hence

$$(x) \to (C^T)^{-1} (x), \quad (y) \to C (y), \quad A \to C^{-1} A C. \tag{3}$$

In this case the vectors (x) and (y) are said to be contragradient.

E.g. the points and the straight lines of a projective plane are
contragradient. The distinction between cogradient and contragradient vectors
is fundamental for Tensor algebra, which forms the basis of analytic and
differential geometry.
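
The difference between the two kinds of transformation can be seen on a numerical example. The sketch below assumes a hypothetical matrix C of determinant 1 and two hypothetical vectors; none of these values come from the text. It shows that $x_1 y_1 + x_2 y_2$ keeps its value when the vectors are transformed by $(C^T)^{-1}$ and C respectively, as in (3) of 6-55, but in general not when both are transformed by C.

```python
from fractions import Fraction

def transpose(A):
    """Transposed matrix."""
    return [list(row) for row in zip(*A)]

def inverse2(A):
    """Inverse of a regular matrix with two rows and two columns."""
    d = Fraction(A[0][0] * A[1][1] - A[0][1] * A[1][0])
    return [[A[1][1] / d, -A[0][1] / d],
            [-A[1][0] / d, A[0][0] / d]]

def matvec(A, v):
    """The column-vector A(v)."""
    return [sum(A[i][k] * v[k] for k in range(len(v))) for i in range(len(A))]

def dot(x, y):
    """The bilinear form x_1 y_1 + ... + x_n y_n."""
    return sum(a * b for a, b in zip(x, y))

C = [[2, 1], [5, 3]]   # assumed example, determinant 1
x = [3, -1]
y = [4, 5]

# Contragradient: (x) -> (C^T)^{-1}(x), (y) -> C(y).
x_contra = matvec(inverse2(transpose(C)), x)
y_new = matvec(C, y)

# Cogradient: both vectors transformed by C.
x_co = matvec(C, x)

print(dot(x, y), dot(x_contra, y_new), dot(x_co, y_new))
```

The first two printed values coincide, the third differs from them; only the contragradient pair leaves the form invariant.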